In this example, we will walk through a possible use case of the
nasapy library by extracting the next 10 years of close-approaching objects to Earth identified by NASA's Jet Propulsion Laboratory's Small-Body Database.
Before diving in, import the packages we will use to extract and analyze the data. The data analysis library pandas will be used to wrangle the data, while seaborn is used for plotting the data. The magic command
%matplotlib inline is loaded to display the generated plots.
import nasapy
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
The close_approach method of the
nasapy library allows one to access the JPL SBDB to extract data related to known meteoroids and asteroids within proximity to Earth. Setting the parameter
return_df=True automatically coerces the returned JSON data into a pandas DataFrame. After extracting the data, we convert the three distance columns to floats.
ca = nasapy.close_approach(date_min='2020-01-01', date_max='2029-12-31', return_df=True)

ca['dist'] = ca['dist'].astype(float)
ca['dist_min'] = ca['dist_min'].astype(float)
ca['dist_max'] = ca['dist_max'].astype(float)
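The casts above suggest the distance values arrive as strings. As a minimal sketch of the same conversion, the three columns can be cast in one pass. The `sample` frame below is a hypothetical stand-in for `ca`, and its second row is made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by nasapy.close_approach;
# the values are strings, mimicking raw JSON fields.
sample = pd.DataFrame({
    'dist': ['0.000252', '0.031960'],
    'dist_min': ['0.000248', '0.031000'],
    'dist_max': ['0.000257', '0.033000'],
})

# Cast every distance column to float in one pass.
dist_cols = ['dist', 'dist_min', 'dist_max']
sample[dist_cols] = sample[dist_cols].astype(float)

print(sample.dtypes)
```

Listing the column names once keeps the conversion in a single statement if more numeric columns need the same treatment later.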
The dist column of the returned data describes the nominal approach distance of the object in astronomical units (AU). An astronomical unit is roughly the distance from the Earth to the Sun, approximately 92,955,807 miles or 149,598,000 kilometers. Using the
.describe method, we can display descriptive statistics that summarize the data.
ca['dist'].describe()

count    729.000000
mean       0.030775
std        0.012718
min        0.000252
25%        0.021520
50%        0.031960
75%        0.041442
max        0.049983
Name: dist, dtype: float64
We see the mean approach distance in AUs is approximately 0.031, which we can transform into miles:
au_miles = 92955807.26743
ca['dist'].describe()['mean'] * au_miles
Thus the average distance of the approaching objects to Earth over the next decade is about 2.86 million miles, which is roughly 12 times the distance from the Earth to the Moon (238,900 miles).
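The arithmetic behind that comparison can be checked directly. This sketch reuses the mean from the summary statistics and the two distances quoted in the text. The constant names are illustrative.

```python
AU_MILES = 92955807.26743   # miles per astronomical unit, as used above
MOON_MILES = 238900         # Earth-Moon distance quoted above

mean_au = 0.030775          # mean of the dist column from the summary above
mean_miles = mean_au * AU_MILES
lunar_distances = mean_miles / MOON_MILES

print(f'{mean_miles:,.0f} miles, about {lunar_distances:.1f} lunar distances')
```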
What about the closest approaching object to Earth within the next ten years? Using the
.loc method, we can find the object with the closest approaching distance.
ca.loc[ca['dist'] == ca['dist'].min()]
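| | des | orbit_id | jd | cd | dist | dist_min | dist_max | v_rel | v_inf | t_sigma_f | h |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 678 | 99942 | 199 | 2462240.407032288 | 2029-Apr-13 21:46 | 0.000252 | 0.000248 | 0.000257 | 7.43332261672295 | 5.84141384414201 | < 00:01 | 19.7 |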
The closest approaching known object is expected to approach Earth near the end of the decade, on April 13, 2029, at a distance of about 0.00025 AU. Transforming the astronomical units into miles, we can get a better sense of the approach distance of the object.
print('Distance: ' + str(au_miles * ca['dist'].min()))
print('Minimum Distance: ' + str(au_miles * ca['dist_min'].min()))
print('Maximum Distance: ' + str(au_miles * ca['dist_max'].min()))
Distance: 23440.92769543333
Minimum Distance: 644.2481158331191
Maximum Distance: 23874.510393069424
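Oh my! It looks like this object will approach Earth relatively close, at about 23,000 miles. Note that the code takes the minimum of each column separately, so the 644-mile figure is the smallest dist_min of any object in the data, not necessarily of this one. The row above lists this object's own dist_min and dist_max as 0.000248 and 0.000257 AU, or roughly 23,050 to 23,890 miles. For comparison, the maximum distance is about 1/10 of the distance from the Earth to the Moon.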
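To report all three distances for the same object, one option is to select the closest row first and read the values from it. This is a minimal sketch on a hypothetical two-row stand-in for `ca`: the first row copies the rounded values shown for object 99942, and the second row is made up for illustration.

```python
import pandas as pd

AU_MILES = 92955807.26743  # miles per astronomical unit, as used above

# Hypothetical two-row stand-in for ca. The first row copies the rounded
# values shown for object 99942; the second row is invented.
sample = pd.DataFrame({
    'des': ['99942', 'example'],
    'dist': [0.000252, 0.010000],
    'dist_min': [0.000248, 0.000007],
    'dist_max': [0.000257, 0.020000],
})

# idxmin returns the index label of the smallest nominal distance,
# so all three values below come from the same object.
closest = sample.loc[sample['dist'].idxmin()]

for col in ['dist', 'dist_min', 'dist_max']:
    print(col, round(closest[col] * AU_MILES), 'miles')
```

In this sample, the column-wise minimum of `dist_min` belongs to the second row, which is the situation the row-based selection avoids.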
Let's get a sense of the number of approaching objects to Earth by year over the next decade. First, we extract the year of the approach date using a combination of
.apply and pandas's to_datetime, storing the result in a new column, approach_year.
ca['approach_year'] = ca['cd'].apply(lambda x: pd.to_datetime(x).year)
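As an alternative sketch, the same column can be built without `.apply` by parsing the whole column at once and reading the year through the `.dt` accessor. The sample dates are hypothetical, and the format string is an assumption based on the one date shown earlier in this post.

```python
import pandas as pd

# Hypothetical sample of approach dates in the format shown in the cd column.
cd = pd.Series(['2020-Jan-05 10:12', '2029-Apr-13 21:46'])

# Parse the whole column at once, then read the year through the .dt accessor.
# The format string is an assumption based on the one date shown above.
approach_year = pd.to_datetime(cd, format='%Y-%b-%d %H:%M').dt.year

print(approach_year.tolist())
```

Passing an explicit format avoids per-element format inference, which tends to matter more as the column grows.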
Using the .groupby method, we create a new DataFrame with the aggregated count of approaching objects for each year.
approaches_by_year = ca.groupby('approach_year').count().reset_index()
plt.figure(figsize=(10, 6))

p = sns.barplot(x='approach_year', y='h', data=approaches_by_year)
plt.axhline(approaches_by_year['h'].mean(), color='r', linestyle='--')
p = p.set_xticklabels(approaches_by_year['approach_year'], rotation=45, ha='right', fontsize=12)
Interestingly, this year (2020) will have the most activity, and then it will somewhat decline over the next few years until the end of the decade. On average, there are roughly 73 Earth-approaching objects in each year of the decade (729 objects over ten years).
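The yearly counts above come from `.count()`, which tallies non-null values in each column, so the plot reads the count from the `h` column. If a column such as `h` had missing values, `.size()` would count rows instead. This is a sketch on a hypothetical miniature of `ca`, where the column name `n_objects` is illustrative.

```python
import pandas as pd

# Hypothetical miniature of ca: three approaches across two years,
# with one missing h value.
sample = pd.DataFrame({
    'approach_year': [2020, 2020, 2021],
    'h': [19.7, None, 22.1],
})

# count() tallies non-null values in each remaining column.
by_count = sample.groupby('approach_year').count().reset_index()

# size() tallies rows per group regardless of missing values.
by_size = sample.groupby('approach_year').size().reset_index(name='n_objects')

print(by_count)
print(by_size)
```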
As the last example, let's plot the distribution of the approaching object distances using seaborn's
.kdeplot, which creates a kernel density plot. We can also add a mean line of the distances similar to how we did above.
plt.figure(figsize=(14, 6))

plt.axvline(ca['dist'].astype(float).mean(), color='r', linestyle='--')
# Newer seaborn releases use fill=True in place of shade=True.
sns.kdeplot(ca['dist'], shade=True)
As we noted above, the mean approach distance is a little more than 0.03 AU, which we can see in the density plot above. Lastly, we can overlay a normal distribution on the distribution of the distances, using numpy.random.normal to draw a sample with the same mean and standard deviation, for a quick comparison of the actual distribution with a normal one.
plt.figure(figsize=(14, 6))

x = np.random.normal(size=len(ca['dist']),
                     scale=ca['dist'].std(),
                     loc=ca['dist'].mean())

plt.axvline(ca['dist'].mean(), color='r', linestyle='--')

sns.kdeplot(ca['dist'], shade=True)
sns.kdeplot(x, shade=True)
We see the distribution of the distances is not quite normal. There are definitely much more sophisticated techniques for analyzing the distribution of a dataset, but that will be saved for a possible future exercise.
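The normal reference sample used in the comparison above is random, so the overlaid curve shifts slightly from run to run. Seeding a NumPy Generator makes it repeatable. This sketch reuses the count, mean, and standard deviation reported earlier, and the seed value is arbitrary.

```python
import numpy as np

# Summary statistics of the dist column reported earlier in this post.
n, mean, std = 729, 0.030775, 0.012718

# A seeded Generator makes the reference sample repeatable across runs.
# The seed value 42 is arbitrary.
rng = np.random.default_rng(42)
x = rng.normal(loc=mean, scale=std, size=n)

print(x.mean(), x.std(ddof=1))
```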