Urban Mobility Dataset
Overview
In this guide, you’ll learn how to use the Oxla database with Urban Mobility datasets through various sample queries. These datasets include time-series data from New York City Taxi and For-Hire Vehicle Data. It contains several tables, such as trips, taxi zones, and weather observations.
Data Source: https://github.com/toddwschneider/nyc-taxi-data .
Ensure you’ve created the Data Storage with the appropriate demo dataset, the Cluster is running, and you’re connected to it using the PostgreSQL client. For more details, refer to the Quickstart - Oxla SaaS page.
Datasets Structure
Let’s explore the tables and their structures to better understand and fully utilize the Urban Mobility dataset.
cab_types Table
cab_types Table
Attribute Name | Type |
---|---|
id | Bigint |
type | Text |
central_park_weather_observations Table
central_park_weather_observations Table
Attribute Name | Type |
---|---|
station_id | Text |
station_name | Text |
date | Date |
precipitation | Double precision |
snow_depth | Double precision |
snowfall | Double precision |
max_temperature | Double precision |
min_temperature | Double precision |
average_wind_speed | Double precision |
fhv_bases Table
fhv_bases Table
Attribute Name | Type |
---|---|
base_number | Text |
base_name | Text |
dba | Text |
dba_category | Text |
fhv_trips Table
fhv_trips Table
Attribute Name | Type |
---|---|
id | Bigint |
hvfhs_license_num | Text |
dispatching_base_num | Text |
originating_base_num | Text |
request_datetime | Timestamp without time zone |
on_scene_datetime | Timestamp without time zone |
pickup_datetime | Timestamp without time zone |
dropoff_datetime | Timestamp without time zone |
pickup_location_id | Bigint |
dropoff_location_id | Bigint |
trip_miles | Double precision |
trip_time | Double precision |
base_passenger_fare | Double precision |
tolls | Double precision |
black_car_fund | Double precision |
sales_tax | Double precision |
congestion_surcharge | Double precision |
airport_fee | Double precision |
tips | Double precision |
driver_pay | Double precision |
shared_request | Boolean |
shared_match | Boolean |
access_a_ride | Boolean |
wav_request | Boolean |
wav_match | Boolean |
legacy_shared_ride | Bigint |
affiliated_base_num | Text |
fhv_trips_staging Table
fhv_trips_staging Table
Attribute Name | Type |
---|---|
hvfhs_license_num | Text |
dispatching_base_num | Text |
originating_base_num | Text |
request_datetime | Timestamp without time zone |
on_scene_datetime | Timestamp without time zone |
pickup_datetime | Timestamp without time zone |
dropoff_datetime | Timestamp without time zone |
pickup_location_id | Bigint |
dropoff_location_id | Bigint |
trip_miles | Double precision |
trip_time | Double precision |
base_passenger_fare | Double precision |
tolls | Double precision |
black_car_fund | Double precision |
sales_tax | Double precision |
congestion_surcharge | Double precision |
airport_fee | Double precision |
tips | Double precision |
driver_pay | Double precision |
shared_request_flag | Text |
shared_match_flag | Text |
access_a_ride_flag | Text |
wav_request_flag | Text |
wav_match_flag | Text |
shared_ride_flag | Text |
affiliated_base_num | Text |
legacy_shared_ride_flag | Text |
green_tripdata_staging Table
green_tripdata_staging Table
Attribute Name | Type |
---|---|
id | Bigint |
vendor_id | Bigint |
lpep_pickup_datetime | Timestamp without time zone |
lpep_dropoff_datetime | Timestamp without time zone |
store_and_fwd_flag | Text |
rate_code_id | Bigint |
dropoff_location_id | Bigint |
congestion_surcharge | Double precision |
passenger_count | Bigint |
trip_distance | Double precision |
fare_amount | Double precision |
extra | Double precision |
mta_tax | Double precision |
tip_amount | Double precision |
tolls_amount | Double precision |
ehail_fee | Double precision |
improvement_surcharge | Double precision |
total_amount | Double precision |
payment_type | Bigint |
trip_type | Bigint |
pickup_location_id | Bigint |
hvfhs_licenses Table
hvfhs_licenses Table
Attribute Name | Type |
---|---|
license_number | Text |
company_name | Text |
nyct2010 Table
nyct2010 Table
Attribute Name | Type |
---|---|
gid | Bigint |
ctlabel | Text |
borocode | Text |
boroname | Text |
ct2010 | Text |
boroct2010 | Text |
cdeligibil | Text |
ntacode | Text |
ntaname | Text |
puma | Text |
shape_leng | Double precision |
shape_area | Double precision |
geom | Text |
nyct2010_taxi_zones_mapping Table
nyct2010_taxi_zones_mapping Table
Attribute Name | Type |
---|---|
nyct2010_gid | Bigint |
taxi_zone_location_id | Integer |
overlap | Double precision |
spatial_ref_sys Table
spatial_ref_sys Table
Attribute Name | Type |
---|---|
srid | Bigint |
auth_name | Text |
auth_srid | Bigint |
srtext | Text |
proj4text | Text |
trips Table
trips Table
Attribute Name | Type |
---|---|
id | Bigint |
cab_type_id | Bigint |
vendor_id | Bigint |
pickup_datetime | Timestamp without time zone |
dropoff_datetime | Timestamp without time zone |
store_and_fwd_flag | Boolean |
rate_code_id | Bigint |
pickup_longitude | Double precision |
pickup_latitude | Double precision |
dropoff_longitude | Double precision |
dropoff_latitude | Double precision |
passenger_count | Bigint |
trip_distance | Double precision |
fare_amount | Double precision |
extra | Double precision |
mta_tax | Double precision |
tip_amount | Double precision |
tolls_amount | Double precision |
ehail_fee | Double precision |
improvement_surcharge | Double precision |
congestion_surcharge | Double precision |
airport_fee | Double precision |
total_amount | Double precision |
payment_type | Bigint |
trip_type | Bigint |
pickup_nyct2010_gid | Bigint |
dropoff_nyct2010_gid | Bigint |
pickup_location_id | Bigint |
dropoff_location_id | Bigint |
taxi_zones Table
taxi_zones Table
Attribute Name | Type |
---|---|
gid | Bigint |
objectid | Bigint |
shape_leng | Double precision |
shape_area | Double precision |
zone | Text |
locationid | Integer |
borough | Text |
geom | Text |
uber_trips_2014 Table
uber_trips_2014 Table
Attribute Name | Type |
---|---|
id | Bigint |
pickup_datetime | Timestamp without time zone |
pickup_latitude | Double precision |
pickup_longitude | Double precision |
base_code | Text |
yellow_tripdata_staging Table
yellow_tripdata_staging Table
Attribute Name | Type |
---|---|
id | Bigint |
vendor_id | Text |
tpep_pickup_datetime | Timestamp without time zone |
tpep_dropoff_datetime | Timestamp without time zone |
passenger_count | Bigint |
trip_distance | Double precision |
pickup_longitude | Double precision |
pickup_latitude | Double precision |
rate_code_id | Text |
store_and_fwd_flag | Text |
dropoff_longitude | Double precision |
dropoff_latitude | Double precision |
pickup_location_id | Bigint |
dropoff_location_id | Bigint |
payment_type | Text |
fare_amount | Double precision |
extra | Double precision |
mta_tax | Double precision |
tip_amount | Double precision |
tolls_amount | Double precision |
improvement_surcharge | Double precision |
total_amount | Double precision |
congestion_surcharge | Double precision |
airport_fee | Double precision |
Sample Queries
Sample 1: Number of Trips by Payment Type
This example shows the number of trips for each payment type used in the dataset.
The query displays number of trips grouped by payment type.
Sample 2: Calculate Revenue per Pickup Point
This example calculates the top 10 pickup locations with the highest revenue.
The result displays the top 10 pickup locations, with JFK Airport having the highest revenue.
Sample 3: Manhattan’s Most Popular Cabs
This example provides insights into Manhattan’s most popular cabs by showing key statistics for each cab type.
The result shows the shortest and farthest trip distances, average trip distance, and average fare for each cab type.
Sample 4: How Weather Impacts the Transportation Demand Analysis
Here we analyze the average temperature and wind speed from January to April on the 1st and 2nd of each month.
The output helps us understand how weather conditions from January to April impact the number of trips taken on specific dates. It shows that days with higher temperatures and stronger winds see more trips happening.