In this example, we will walk through a possible use case of the nasapy
library by extracting the next 10 years of close-approaching objects to Earth identified by NASA's Jet Propulsion Laboratory's Small-Body Database.
Before diving in, import the packages we will use to extract and analyze the data. The data analysis library pandas will be used to wrangle the data, while seaborn is used for plotting the data. The magic command %matplotlib inline
is loaded to display the generated plots.
import nasapy
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
The close_approach
method of the nasapy
library allows one to access the JPL SBDB to extract data related to known meteoroids and asteroids within proximity to Earth. Setting the parameter return_df=True
automatically coerces the returned JSON data into a pandas DataFrame. After extracting the data, we transform several of the variables into float
type.
ca = nasapy.close_approach(date_min='2020-01-01', date_max='2029-12-31', return_df=True)
ca['dist'] = ca['dist'].astype(float)
ca['dist_min'] = ca['dist_min'].astype(float)
ca['dist_max'] = ca['dist_max'].astype(float)
The dist
column of the returned data describes the nominal approach distance of the object in astronomical units (AU). An astronomical unit, or AU is roughly the distance of the Earth to the Sun, approximately 92,955,807 miles or 149,598,000 kilometers. Using the .describe
method, we can display descriptive statistics that summarize the data.
ca['dist'].describe()
We see the mean approach distance in AUs is approximately 0.031, which we can transform into miles:
au_miles = 92955807.26743
ca['dist'].describe()['mean'] * au_miles
Thus the average distance of the approaching objects to Earth over the next decade is about 2.86 million miles, which is more than 10 times the distance from the Earth to the Moon (238,900 miles).
What about the closest approaching object to Earth within the next ten years? Using the .loc
method, we can find the object with the closest approaching distance.
ca.loc[ca['dist'] == ca['dist'].min()]
The closest approaching known object is expected to approach Earth near the end of the decade, on April 13, 2029, at a distance of 0.00023 AU. Transforming the astronomical units into miles, we can get a better sense of the approach distance of the object.
print('Distance: ' + str(au_miles * ca['dist'].min()))
print('Minimum Distance: ' + str(au_miles * ca['dist_min'].min()))
print('Maximum Distance: ' + str(au_miles * ca['dist_max'].min()))
Oh my! It looks like this object will approach Earth relatively close, at about 23,000 miles, in a range of [644, 23874] miles. For comparison, the maximum distance is about 1/10 of the distance from the Earth to the Moon.
Let's get a sense of the number of approaching objects to Earth by year over the next decade. First, we extract the year of the approach date using a combination of .apply
and to_datetime
into a new column approach_year
.
ca['approach_year'] = ca['cd'].apply(lambda x: pd.to_datetime(x).year)
Using the .groupby
method, we create a new DataFrame with the aggregated count of approaching objects for each year.
approaches_by_year = ca.groupby('approach_year').count().reset_index()
We can use seaborn's barplot function to plot the count of approaching objects for each year.
plt.figure(figsize=(10, 6))
p = sns.barplot(x='approach_year', y='h', data=approaches_by_year)
plt.axhline(approaches_by_year['h'].mean(), color='r', linestyle='--')
p = p.set_xticklabels(approaches_by_year['approach_year'], rotation=45, ha='right', fontsize=12)
Interestingly, this year (2020) will have the most activity, and then it will somewhat decline over the next few years until the end of the decade. On average, there are a little less than 80 Earth approaching objects each year of the decade.
As the last example, let's plot the distribution of the approaching object distances using seaborn's .kdeplot
which creates a kernel destiny plot. We can also add a mean line of the distances similar to how we did above.
plt.figure(figsize=(14, 6))
plt.axvline(ca['dist'].astype(float).mean(), color='r', linestyle='--')
sns.kdeplot(ca['dist'], shade=True)
As we noted above, the mean approach distance is a little more than 0.03, which we can see in the density plot above. Lastly, we can plot a normal distribution over the distribution of the distances using numpy.random.normal to get a quick comparison of the actual distribution compared to a normal one.
plt.figure(figsize=(14, 6))
x = np.random.normal(size=len(ca['dist']),
scale=ca['dist'].std(),
loc=ca['dist'].mean())
plt.axvline(ca['dist'].mean(), color='r', linestyle='--')
sns.kdeplot(ca['dist'], shade=True)
sns.kdeplot(x, shade=True)
We see the distribution of the distances is not quite normal. There are definitely much more sophisticated techniques for analyzing the distribution of a dataset, but that will be saved for a possible future exercise.
Related Posts
- Plot Earth Fireball Impacts with nasapy, pandas and folium
- Get All NASA Astronomy Pictures of the Day from 2019
- From Intake to Outcome: Analyzing the Austin Animal Center's Intake and Outcomes Datasets
- Austin Animal Center Intakes Exploratory Data Analysis with Python, Pandas and Seaborn
- Extract and Analyze the Seattle Pet Licenses Dataset