Big Data Brazil meetup

Rule book

  1. there are no rules, just suggestions
  2. avoid presentations at all costs; instead, prepare either a script that you can work through or, even better, a reproducible document with brief explanations and code
  3. feel free to re-use material and data available from the Web, but provide references and give credit to the authors

Topic list

  1. Sublime Text and R Markdown for reproducible research: knitr, pander, pandoc, table generation - Ricardo (done)
  2. Git, GitHub and gists for reproducible research - Ricardo (done)
  3. Jekyll for the dissemination of research information - Ricardo (done)
  4. Data mining with rattle - Ricardo (code selected)
  5. ggplot2 for graphical exploratory analysis - Ricardo (code selected)
  6. slidify, DOT language, dynamic graphics (rCharts, animations, googleVis) and basic CSS for reproducible slides - Ricardo (code selected)
  7. basics of R data management - Ricardo (skeleton in place)
  8. bare bones of R programming and package development - Ricardo (code selected)
  9. data posting to GitHub and other data repositories, API retrieval (JSON and XML), and NLTK processing using Python - Ricardo
  10. data simulation and analysis of randomized experiments - Ricardo
  11. predictive modeling and assessment statistics - Ricardo
  12. dissecting linear regression through matrix manipulation - Ricardo
  13. predictive modeling and assessment statistics - Ricardo (code selected)
  14. data gathering from GitHub and other data repositories - Ricardo (code selected)
  15. data simulation - Ricardo (code selected)
  16. Rcpp for C loops in R for function optimization - Ricardo (code selected)
  17. dissecting linear regression through matrix manipulation - Ricardo (code selected)
  18. web searching hacks for multilanguage data and information retrieval - Ricardo
  19. basic Bayesian regression models - Ricardo
  20. basic UX (User eXperience) for dissemination of research information - Ricardo
  21. API retrieval (JSON and XML) and NLTK processing using Python
  22. MongoDB - Jacson??
  23. R interaction with RDF and LOD - Jacson??
  24. Hadoop - Jose Eduardo??
  25. Bayesian nets - Jose Eduardo??
  26. Concerto, IRT and CAT - Joao (remote)??
  27. sqlliteR for data management - Elias??
  28. Global Portuguese for article writing - Katia??
  29. knitcitations and bibtex embedded in markdown for reproducible research - Elias??
  30. CFA and SEM with psych and lavaan - Joao (remote)??
  31. MongoDB API, MongoHQ, and queries - Marcelo??
  32. D2RQ and SQL-to-RDF conversion - Marcelo??
  33. QCA (logic and fuzzy logic) for qualitative studies - Joao??
  34. text mining with tm and NLTK - Jacson??
  35. R and BIs - Jacson??

This document is licensed under the DILLIGAS public license, which allows you to freely copy and modify it for any purpose, commercial or not.