This function generates a data frame similar to the
flights dataset from nycflights13
for any US airport and time frame. Note that,
even with a fast internet connection, this function
may take several minutes to download the relevant data.
Source
RITA, Bureau of Transportation Statistics, https://www.bts.gov
Arguments
- station
A character vector giving the origin US airports of interest (as the FAA LID airport code).
- year
A numeric giving the year of interest. This argument is currently not vectorized, because the dataset for a single year is already very large. Data for the most recent year are usually available by February or March of the following year.
- month
A numeric giving the month(s) of interest.
- dir
An optional character string giving the directory to save datasets in. By default, datasets will not be saved to file.
- ...
Currently only used internally.
Value
A data frame with ~1k-500k rows and 19 variables:
- year, month, day
Date of departure.
- dep_time, arr_time
Actual departure and arrival times, UTC.
- sched_dep_time, sched_arr_time
Scheduled departure and arrival times, UTC.
- dep_delay, arr_delay
Departure and arrival delays, in minutes. Negative times represent early departures/arrivals.
- hour, minute
Time of scheduled departure broken into hour and minutes.
- carrier
Two letter carrier abbreviation. See get_airlines to get the full name.
- tailnum
Plane tail number.
- flight
Flight number.
- origin, dest
Origin and destination. See get_airports for additional metadata.
- air_time
Amount of time spent in the air, in minutes.
- distance
Distance between airports, in miles.
- time_hour
Scheduled date and hour of the flight as a POSIXct date. Along with origin, can be used to join flights data to weather data.
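As a rough illustration of the join mentioned for time_hour, the sketch below merges two small made-up data frames on origin and time_hour using base R. The rows, the flight numbers, and the temp column are invented for illustration and are not real output of get_flights() or get_weather(); it assumes the weather data also carries origin and time_hour columns.

```r
# Toy stand-ins for flights and weather data (all values are made up).
hours <- as.POSIXct(c("2018-06-01 06:00:00", "2018-06-01 07:00:00"), tz = "UTC")

flights <- data.frame(
  origin    = c("PDX", "PDX", "PDX"),
  time_hour = hours[c(1, 1, 2)],
  flight    = c(101, 202, 303)
)

# Assumption: weather has one row per origin and hour, with a `temp` column.
weather <- data.frame(
  origin    = c("PDX", "PDX"),
  time_hour = hours,
  temp      = c(55.0, 57.2)
)

# Left join: keep every flight and attach the weather for its origin and hour.
joined <- merge(flights, weather, by = c("origin", "time_hour"), all.x = TRUE)
```

Each flight row should end up with the weather observation for its scheduled hour; flights without a matching observation would get NA in the weather columns.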
Details
This function currently downloads data for all stations for each month
supplied, and then filters out data for relevant stations. Thus,
the recommended approach to download data for many airports is to supply
a vector of airport codes to the station argument rather than
iterating over many calls to get_flights().
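The cost difference described above can be seen in a small toy model. The code below does not call the package: fake_download_month() and fake_get_flights() are hypothetical stand-ins that only mimic the "download the whole month, then filter by station" behaviour, and a counter records how many downloads each approach triggers.

```r
# Toy model only: these helpers are hypothetical and do no real downloading.
downloads <- 0
fake_download_month <- function(month) {
  downloads <<- downloads + 1  # count each simulated download
  data.frame(month = month, origin = c("PDX", "SEA", "JFK"))
}

# Mimics the described behaviour: fetch each month in full,
# then keep only the requested stations.
fake_get_flights <- function(station, months) {
  all_flights <- do.call(rbind, lapply(months, fake_download_month))
  all_flights[all_flights$origin %in% station, ]
}

stations <- c("PDX", "SEA")
months <- 1:3

# One call with a vector of stations: one download per month.
downloads <- 0
vectorized <- fake_get_flights(stations, months)
downloads_vectorized <- downloads

# One call per station: every month is downloaded again for each station.
downloads <- 0
looped <- do.call(rbind, lapply(stations, fake_get_flights, months = months))
downloads_looped <- downloads
```

Both approaches return the same number of rows, but in this model the looped version downloads each month once per station. With the real function, the analogous single call would pass the whole vector, for example get_flights(c("PDX", "SEA"), 2018, 1:3).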
Note
If you repeatedly get a timeout error when downloading flights,
your download may be taking longer than R's default timeout option
allows. You can change the timeout value for your R session by running the
code options(timeout = timeout_value_in_seconds) in your console.
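A minimal sketch of raising the timeout for the duration of a download and restoring it afterwards. The value of 600 seconds is an arbitrary example, not a recommendation from the package; R's own default is 60 seconds.

```r
# options() returns the previous values, so they can be restored later.
old <- options(timeout = 600)  # 600 seconds is an arbitrary example value

# ... call get_flights() here ...

new_timeout <- getOption("timeout")  # the timeout in effect during the download

# Restore the previous timeout once the download is done.
options(old)
```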
See also
get_weather for weather data,
get_airlines for airlines data,
get_airports for airports data,
get_planes for planes data,
or anyflights for a wrapper function.
Use the as_flights_package function to convert this dataset
to a data-only package.
Examples
# flights out of Portland International in June 2018
if (FALSE) get_flights("PDX", 2018, 6) # \dontrun{}
# ...or the original nycflights13 flights dataset
if (FALSE) get_flights(c("JFK", "LGA", "EWR"), 2013) # \dontrun{}
# use the dir argument to indicate the folder in which to
# save the data as "flights.rda"
if (FALSE) get_flights("PDX", 2018, 6, dir = tempdir()) # \dontrun{}
