DBT 30 Days Roadmap
For data analysts, working with APIs is a great way to access real-time, dynamic data that reflects real-world business scenarios far more closely than static datasets. Instead of simply analyzing a pre-packaged file, you'll need to think critically about what data is available, how to get it, and what business questions it can answer.
1. Financial Modeling Prep 📊
The Source: A free API providing real-time and historical stock data.
Objective: Correlate stock price movements with earnings report dates.
⭐ S.T.A.R. Framework
| Phase | Description |
| --- | --- |
| Situation | A financial publication needs data-driven articles on market reactions to company news. |
| Task | Identify the impact of quarterly earnings reports on the stock price of major tech companies (e.g., Apple). |
| Action | Data Collection: Use the Financial Modeling Prep API to retrieve daily stock price data (e.g., historical closing prices) for a chosen company. |
| Result | A quantifiable relationship showing market performance 3 days before/after earnings events. |
💡 Tip: Key Metrics to Calculate
Volatility Index: Standard deviation of price during the 7-day event window.
Abnormal Return: Difference between the stock's return and the market index return.
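As a starting point, here is a minimal Python sketch of the event-window calculation. It assumes you have a free Financial Modeling Prep API key and uses FMP's documented historical-price-full endpoint; the earnings date below is hypothetical, and the abnormal-return step is only noted in a comment since it needs a second pull for a market index.

```python
import statistics
import requests

API_KEY = "YOUR_FMP_API_KEY"  # assumption: a free Financial Modeling Prep key
SYMBOL = "AAPL"

# Fetch daily historical prices from FMP's /api/v3/historical-price-full route.
url = f"https://financialmodelingprep.com/api/v3/historical-price-full/{SYMBOL}"
resp = requests.get(url, params={"apikey": API_KEY}, timeout=30)
resp.raise_for_status()
prices = resp.json()["historical"]  # [{"date": "...", "close": ...}, ...]

# Index closing prices by date so we can slice an event window.
closes = {row["date"]: row["close"] for row in prices}
dates = sorted(closes)  # ISO date strings sort chronologically

def event_window(event_date: str, days: int = 3) -> list[float]:
    """Closing prices from `days` before to `days` after the event date."""
    i = dates.index(event_date)  # assumes the event falls on a trading day
    return [closes[d] for d in dates[max(0, i - days): i + days + 1]]

window = event_window("2024-05-02")  # hypothetical earnings date

# Volatility Index: standard deviation of price during the 7-day event window.
volatility = statistics.stdev(window)

# Simple daily returns inside the window. For the Abnormal Return, repeat the
# pull for a market index proxy (e.g., SPY) and subtract its window return.
returns = [(b - a) / a for a, b in zip(window, window[1:])]
print(f"volatility={volatility:.2f}, mean daily return={statistics.mean(returns):.4%}")
```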
2. Strava API 🏃
The Source: Fitness data including activities and health metrics, useful for projects on health, wellness, and consumer behavior.
Objective: Inform city infrastructure projects by identifying popular transit routes.
⭐ S.T.A.R. Framework
Situation: A new urban planning consulting firm wants to use fitness data to inform city infrastructure projects. I need to demonstrate our analytical capabilities by identifying popular active transport routes.
Task: Analyze public Strava data for a major metropolitan area to identify the most popular cycling and running routes. The goal is to provide data-driven recommendations to a city's transportation department on where to prioritize new bike lanes or pedestrian paths.
Action:
Data Collection: Access the Strava API to gather anonymized activity data for a city. Filter the data to focus on cycling and running activities.
Geospatial Analysis: Use a geospatial tool (like a Python library or GIS software) to plot the activity routes on a map. Aggregate and count the number of activities that pass through specific segments of the city's road network.
Trend Identification: Identify the highest-density areas of activity. Analyze the data to see if these popular routes are currently well-served by existing infrastructure (e.g., bike lanes, wide sidewalks) or if they present opportunities for new development.
Recommendation: Formulate a report that highlights the top 3-5 high-traffic corridors and provide actionable recommendations, such as "adding a dedicated bike path along [Street X] would serve the city’s most popular cycling route."
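For the data-collection and geospatial steps, here is a minimal Python sketch against Strava's documented segment-explorer endpoint, which returns popular segments inside a bounding box. It assumes an OAuth access token with read scope; the coordinates are hypothetical, and a real study would tile the city with many boxes and aggregate results in a GIS tool.

```python
import requests

ACCESS_TOKEN = "YOUR_STRAVA_ACCESS_TOKEN"  # assumption: OAuth token with read scope

# Bounding box for the study area: south-west lat/lng, then north-east lat/lng.
# Hypothetical coordinates roughly covering San Francisco.
bounds = "37.70,-122.52,37.82,-122.35"

resp = requests.get(
    "https://www.strava.com/api/v3/segments/explore",
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    params={"bounds": bounds, "activity_type": "riding"},
    timeout=30,
)
resp.raise_for_status()
segments = resp.json()["segments"]  # up to 10 popular segments in the box

# Rank the returned segments as a first proxy for high-traffic corridors.
for seg in sorted(segments, key=lambda s: s.get("distance", 0), reverse=True):
    print(f'{seg["name"]:40s} {seg["distance"]:8.0f} m  climb cat {seg["climb_category"]}')
```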
3. Open Brewery DB API 🍺
The Source: Data on breweries, cideries, and craft beer shops.
Objective: Identify underserved markets for a national distributor.
⭐ S.T.A.R. Framework
Situation: A national craft beer distributor is looking to expand into new markets. The company wants to use data to identify underserved cities with a high potential for growth.
Task: Identify a city that is currently underserved by craft breweries but shows high demand for beer-related businesses. The goal is to provide a compelling, data-backed recommendation for the next market expansion.
Action:
Data Collection: Use the Open Brewery DB API to retrieve a list of all breweries in the United States, including their location data (city, state).
Demand Proxy Analysis: Use an external data source, like the U.S. Census Bureau API or a static dataset, to find the population of each city. This population data can serve as a proxy for demand.
Market Saturation Calculation: Calculate the "brewery density" for each city by dividing the number of breweries by the population. Sort the results to find cities with a low density of breweries.
Recommendation: Present a case for a specific city. The recommendation would be supported by the low brewery density data from the API and could be further strengthened by external data on factors like income levels or a growing young professional population.
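Here is a minimal Python sketch of the density calculation. It uses Open Brewery DB's documented /v1/breweries endpoint with its by_state and per_page filters; the city populations are hypothetical placeholders standing in for a Census Bureau pull.

```python
import requests

def breweries_in_state(state: str) -> list[dict]:
    """First page of breweries for a state (a full pull would paginate; per_page caps at 200)."""
    resp = requests.get(
        "https://api.openbrewerydb.org/v1/breweries",
        params={"by_state": state, "per_page": 200},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# Hypothetical city populations standing in for U.S. Census Bureau data.
population = {"Portland": 652_503, "Austin": 961_855, "Columbus": 905_748}

# Count breweries per city of interest.
counts: dict[str, int] = {}
for state in ("Oregon", "Texas", "Ohio"):
    for brewery in breweries_in_state(state):
        city = brewery.get("city")
        if city in population:
            counts[city] = counts.get(city, 0) + 1

# Brewery density: breweries per 100k residents. A large population with a
# low density suggests an underserved market.
for city, n in counts.items():
    density = n / population[city] * 100_000
    print(f"{city:10s} breweries={n:3d}  per 100k residents={density:5.2f}")
```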
4. OpenSea API 🖼️
The Source: The world’s largest NFT marketplace.
Objective: Differentiate between "hype" and "sustainable value" in digital assets.
⭐ S.T.A.R. Framework
Situation: A market analyst for a digital art investment fund needs to understand the current NFT landscape to inform potential acquisitions. I’m responsible for identifying and analyzing the most valuable and stable collections.
Task: Analyze the top NFT collections to find correlations between sales volume and floor price. The goal is to differentiate between fleeting, hype-driven collections and those with sustainable market activity and value.
Action:
Data Collection: Use the OpenSea API to retrieve data on the top NFT collections, including total sales volume, floor price, and number of owners over time.
Trend Analysis: Analyze the historical data for each collection. Plot the floor price against the sales volume on a time-series chart. Look for patterns: do large spikes in volume correspond to significant increases in floor price? Do collections with consistent volume have more stable prices?
Comparative Analysis: Compare multiple collections against each other. For example, analyze a well-established collection like CryptoPunks against a newer, trending one. Calculate metrics like price-to-volume ratio to find collections that maintain high value with less trading activity, suggesting stronger holding power among owners.
Recommendation: Provide a report that ranks collections based on stability and long-term value potential, supported by your analysis of the relationship between price and volume.
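A minimal Python sketch of the comparative step follows. It assumes OpenSea's v2 collection-stats endpoint and an API key; the response field names and the second collection slug are assumptions to verify against the current API docs before relying on them.

```python
import requests

API_KEY = "YOUR_OPENSEA_API_KEY"  # assumption: OpenSea requires a key for v2 calls

def collection_stats(slug: str) -> dict:
    # v2 collection-stats endpoint; treat the field names below as assumptions,
    # since the response shape may differ by API version.
    resp = requests.get(
        f"https://api.opensea.io/api/v2/collections/{slug}/stats",
        headers={"x-api-key": API_KEY, "accept": "application/json"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["total"]

# Compare an established collection against a newer, trending one.
for slug in ("cryptopunks", "some-trending-collection"):  # second slug is hypothetical
    total = collection_stats(slug)
    floor, volume = total["floor_price"], total["volume"]
    # Price-to-volume ratio: high value maintained with little trading
    # activity suggests stronger holding power among owners.
    ratio = floor / volume if volume else float("nan")
    print(f"{slug:28s} floor={floor:10.2f} volume={volume:12.0f} ratio={ratio:.6f}")
```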
5. NewsAPI.ai 📰
The Source: Real-time news articles and media coverage data.
Objective: Gauge public perception of a product launch by tracking media sentiment.
⭐ S.T.A.R. Framework
Situation: A public relations agency needs to provide a client with a real-time media monitoring dashboard for a new product launch. I've been asked to build a proof-of-concept that shows how to track and analyze media sentiment.
Task: Track news articles related to a specific product launch and analyze their sentiment (positive, negative, neutral) over the first week to gauge public perception.
Action:
| Step | Process |
| --- | --- |
| 1. Collection | Pull articles mentioning a specific product name during a launch window. |
| 2. Analysis | Use NLP (Natural Language Processing) to categorize articles as Positive, Negative, or Neutral. |
| 3. Visualization | Line chart showing the "Sentiment Shift" over the first 7 days of a launch. |
| 4. Reporting | Quantify the shift (e.g., "Positive sentiment dropped 15% after Day 3"). |
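To make the analysis and visualization steps concrete, here is a self-contained Python sketch of the daily "Sentiment Shift" series. The headlines and dates are hypothetical stand-ins for a NewsAPI.ai pull, and the tiny keyword lexicon is a toy substitute for a real NLP model (e.g., VADER or a transformer classifier).

```python
from collections import defaultdict
from datetime import date

# Hypothetical headlines already pulled from NewsAPI.ai for the launch window.
articles = [
    (date(2024, 6, 1), "Launch of the new phone delights early reviewers"),
    (date(2024, 6, 2), "Strong preorders signal a solid start"),
    (date(2024, 6, 3), "Battery complaints mar an otherwise promising debut"),
    (date(2024, 6, 4), "Users report disappointing camera performance"),
]

# Toy lexicon standing in for a real sentiment model; scores are illustrative.
POSITIVE = {"delights", "strong", "solid", "promising"}
NEGATIVE = {"complaints", "mar", "disappointing"}

def score(headline: str) -> int:
    """Net sentiment: positive keyword hits minus negative keyword hits."""
    words = headline.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Daily net sentiment: the series you would plot as the "Sentiment Shift".
daily = defaultdict(int)
for day, headline in articles:
    daily[day] += score(headline)

for day in sorted(daily):
    print(day, "net sentiment:", daily[day])
```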