-
Notifications
You must be signed in to change notification settings - Fork 128
Initial auto config annual BEAGDP #1772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Initial auto config annual BEAGDP #1772
Conversation
Summary of ChangesHello @vishalworkdatacommon, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request establishes a new, automated data ingestion pipeline for annual Gross Domestic Product (GDP) data from the U.S. Bureau of Economic Analysis (BEA). It introduces a Python script to manage the download, extraction, and initial cleaning of the raw data, ensuring consistency and readiness for further processing. Complementing this, a comprehensive set of configuration files (manifest, pv_maps, metadata, and remap files) has been added to precisely define how various BEA GDP datasets are transformed and mapped into statistical variables within the Data Commons framework, enabling regular and structured updates of this critical economic indicator. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new data import for BEA Annual GDP, including a download script, manifest, and configuration files. The overall structure is good, but there are several areas for improvement. The download script uses incorrect type hints (NoReturn instead of None) and has overly broad exception handling. The manifest.json is highly repetitive and could be greatly simplified by using a helper shell script. Additionally, there are potential correctness issues in the pvmap configuration files related to data filtering and inconsistencies across the metadata files that should be addressed to improve maintainability and correctness.
statvar_imports/us_bea/annual_gdp/pv_map/sagdp1__all_areas_1997_2024_pvmap.csv
Outdated
Show resolved
Hide resolved
statvar_imports/us_bea/annual_gdp/pv_map/sagdp1__all_areas_1997_2024_metadata.csv
Show resolved
Hide resolved
0e2efe7 to
22afed3
Compare
da30488 to
0504811
Compare
|
Hi @vishalworkdatacommon, before I get into the detailed review - Can you help me understand why this import has so many pv-maps? For example, I see a lot of files with pattern |
|
Please find the details below : The reason for the high number of pv-maps is that while the files share a similar structure (the Here is a quick breakdown of how the files differ:
By keeping these as separate pv-maps, we ensure that the data is ingested into the correct schema fields without "clashing" or overwriting different types of economic indicators. |
|
@vishalworkdatacommon Thanks for the info. Can you please document this in README file as well for future readers? |
|
Sure adding it in readme file to. |
statvar_imports/us_bea/annual_gdp/pv_map/sagdp1__all_areas_1997_2024_pvmap.csv
Outdated
Show resolved
Hide resolved
I have to take them as reference - but that is county level and but this state level as per understanding and knowledge |
No description provided.