Moved the parquet dataset away from the hive managed folder #278

Open: rchallapalli wants to merge 2 commits into base: master

rchallapalli
Contributor

Krystal,

For some reason, we are seeing the tpcds data being deleted. Abhishek suggested that since the Hive tables are external tables anyway, there is no reason to keep the data under the /user/hive/warehouse/ folder. So I manually copied the data to a different folder and updated the Hive scripts. Can you take a look?

Rahul

…y external tables, there is no reason to maintain them under a hive managed folder
@rchallapalli
Contributor Author

Once you give a +1, I will wait for the others to move the data before committing this change.

@krystaln
Contributor

LGTM +1

@rchallapalli
Contributor Author

Thanks for the review, Krystal. I updated the README file to point to the files on AWS. Take a look if you have some time.

@rchallapalli
Contributor Author

@agirish, I will wait until you have copied all the data onto the relevant clusters.

@agirish
Member

agirish commented Dec 5, 2017

@prasadns14, can you please take a look and work on any pending tasks here?
