Exploring and Visualizing Mixed Data in R with ggplot2 #606
I confirm @rogorido and @nabsiddiqui shared with me access to their repository containing all the required files, and that I handed them over to @anisa-hawes to allow the publishing team to generate the preview, thanks. |
Hello Giulia @semanticnoodles, Igor @rogorido and Nabeel @nabsiddiqui, Many thanks for sharing the lesson submission materials with me. I've now checked the Markdown file, and added some key elements of metadata. I've also checked the accompanying images and assets, ensuring each element meets our requirements. You can find the key files here:
You can review a Preview of the lesson here: -- A few initial notes:
|
Hello again Igor @rogorido and Nabeel @nabsiddiqui.
What's happening now?
Your lesson has been moved to the next phase of our workflow which is Phase 2: Initial Edit. In this Phase, your editor Giulia @semanticnoodles will read your lesson, and provide some initial feedback. Giulia will post feedback and suggestions as a comment in this Issue, so that you can revise your draft in the following Phase 3: Revision 1.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 1 <br> Submission
Who worked on this? : Publishing Manager (@anisa-hawes)
All Phase 1 tasks completed? : Yes
Section Phase 2 <br> Initial Edit
Who's working on this? : Editor (@semanticnoodles)
Expected completion date? : April 20
Section Phase 3 <br> Revision 1
Who's responsible? : Authors (@rogorido + @nabsiddiqui)
Expected timeframe? : ~30 days after feedback is received
|
@anisa-hawes Thanks for your comments. As for the tsv file: no, it is not required. It can be deleted. I'll add the alternative captions. Thanks. |
I added captions and alt texts (10a6a9e), but Nabeel should take a look whether it looks 'Englishly' enough... |
Hello @rogorido and @nabsiddiqui, here follows my preliminary feedback; I am aware it is quite extensive, but I believe these indications could help you strengthen your tutorial. If you need any clarification, please do not hesitate to ask!
Overall feedback
In general, your tutorial provides valuable guidance on navigating and producing a wide range of visualisations, effectively walking through the various features of
Usability: Enhancing the logical structure of the lesson
In my opinion, this is the most critical point to consider. The tutorial lacks a cohesive element to tie its components together, and the organisation of the content could benefit from a more linear and less convoluted approach. The case study you propose (sister cities) seems to be just a tool to obtain a series of visualisations. This is fair enough, but it could benefit from further methodological contextualisation and unpacking: the people following your tutorial may not be historians, nor have a clear understanding of the methods you are using, although they may be familiar with R.
In terms of improving the overall content, I think there are two possible directions for you to consider: either revising the content to follow a visualisation task-based narrative or placing more emphasis on the structure of the case study. The first option would privilege the visualisation tasks (but still require some methodological support for the case study), while the second would require you to generate stronger and sharper research questions from the case study, to be answered (at least in part) by the visualisation tasks. I think @nabsiddiqui did a very good job of structuring the content in the lesson Data Wrangling and Management in R, so I would recommend keeping that in mind as a reference.
The title of the proposal could benefit from being more specific, or at least from mentioning the context of application.
The table of contents looks unbalanced: the headings and their actual wording could be better aligned with the content they cover, and the nesting could be more linear. You give very clear information about the concept of the grammar of graphics - this is really the cornerstone of understanding how
Sustainability: Critically reviewing the data analysis narrative
The dataset looks more than adequate for the visualisation tasks you have set as objectives, but the data narrative and its wording could benefit from further tuning. What you offer in this lesson is mostly visualisation of data distributions and there is little statistical testing involved. As your topic is sister cities, it makes perfect sense to talk about relationships, although what you observe are mostly trends or tendencies that you could try to explain through further research; sometimes you clearly point that out and sometimes it looks rather implicit. I think this is just a matter of fine-tuning the language, nothing more.
Section-specific feedback
Para stands for paragraph number; please refer to the preview generated by @anisa-hawes
Introduction, Lesson Goals and Data
ggplot2: General Overview
Sister cities in Europe
Loading Data with
|
@semanticnoodles thanks for your extensive comments. I will have a look at the enhancements you're proposing in the next days. |
What's happening now?
Hello Igor @rogorido and Nabeel @nabsiddiqui. Your lesson has been moved to the next phase of our workflow which is Phase 3: Revision 1. This Phase is an opportunity for you to revise your draft in response to @semanticnoodles's initial feedback. You can make direct commits to your file here: /en/drafts/originals/exploring-visualizing-mixed-data-r-ggplot2.md. @charlottejmc or I are here to help if you encounter any practical problems! When both of you + Giulia are happy with the revised draft, we will move forward to Phase 4: Open Peer Review.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 2 <br> Initial Edit
Who worked on this? : Editor (@semanticnoodles)
All Phase 2 tasks completed? : Yes
Section Phase 3 <br> Revision 1
Who's working on this? : Authors (@rogorido + @nabsiddiqui)
Expected completion date? : May 17
Section Phase 4 <br> Open Peer Review
Who's responsible? : Reviewers (TBC)
Expected timeframe? : ~60 days after request is accepted
|
Hello Igor @rogorido and Nabeel @nabsiddiqui, I hope you are doing well! Just checking in with you about the draft revision (Phase 3 / Revision 1) as the deadline of the 17th of May has passed. If you need some extra time, let me know approximately how much, so we can set up a new deadline -- and @anisa-hawes or @charlottejmc can update the Mermaid timeframe. If you have doubts or need any clarification, please do not hesitate to get in touch. |
Hello @semanticnoodles, I have tried to rework a lot of the tutorial. I feel that changing some of the headings will make the flow more obvious. Let me know if it makes sense the way I have done it or if there should be additional changes. Here is some of what I reviewed based on your timeline. The rest I will leave to @rogorido unless he has an objection:
Introduction, Lesson Goals and Data
ggplot2: General Overview
Sister cities in Europe
Loading Data with
|
Thank you, @nabsiddiqui! @semanticnoodles will review these revisions and advise if we are ready to move onwards to the next Phase of the workflow (which will be Phase 4 Open Peer Review). Giulia is away this week, returning on June 3rd. In the meantime, @charlottejmc and I can help with ensuring that functions and arguments are typographically consistent. These are aspects we always check as part of typesetting at Phase 6, but we'll do a quick scan now so that this isn't a distraction for Reviewers. |
Hello @nabsiddiqui and @semanticnoodles, I've made some adjustments to add backticks to functions, arguments and other parts of code, trying to stay consistent with our house style. |
Hello everybody, I am back! While I was away I got the chance to go through the tutorial and I can say you did upgrade the lesson quite a lot. Brilliant work @nabsiddiqui and @rogorido -- and many many thanks to @charlottejmc and @anisa-hawes for their support! I will take another quick reading as I think I spotted another couple of small things to fix, but I believe now it is almost ready to move onwards to Phase 4. Sorry for the slight delay in my answer -- I will get back to you in a few hours.🖥 |
It took longer than expected (hours became days..). Nevertheless, if @rogorido and @nabsiddiqui can quickly fix the elements in the list below I believe we can move to the open peer review (Phase 4). The most urgent is the first element, the following are about simple formalities/typos.
Thank you for the patience! |
@semanticnoodles (and @nabsiddiqui): I have already corrected all typos (I hope). And I have the correct dataset. But my question is: where exactly should I upload it? Many thanks for your work! |
@justinwigard and @regan008 Thank you very much for the detailed corrections! |
Hi @rogorido and @nabsiddiqui, here is my review/feedback summary (it took a while); thanks a million @regan008 and @justinwigard for all the food for thought and complementary feedback you provided! Both of you highly recommend the lesson for publication 🎉🎉🎉: @regan008 particularly appreciates the explanations about the tibbles and the Grammar of Graphics, while @justinwigard appreciates the engaging tone of the lesson and the way it explains the potential of ggplot2. Here is a quick recap of the core elements you highlight, which I recommend @rogorido & @nabsiddiqui go through carefully.
Notes on Amanda’s feedback
@regan008 makes some detailed comments about typos, potential clarifications (e.g., on plotting packages, coordinate systems), and a suggestion to link out where ECDF is mentioned, clarifying the contents of para 56-60. She also notes that while maps are mentioned, the lesson does not cover them explicitly (this might be a chance to link to Using Geospatial Data to Inform Historical Research in R). There may be an opportunity to use additional line charts, as @regan008 suggests, but this would require further transformations or brand new additions, e.g. using long/lat or population size between sister cities. The structure of the lesson works and I would like you to prioritise the refinements she suggests rather than adding brand new extensions. She makes a good point, but please only add additional data filtering/visualisation if you have time to devote to the task.
Notes on Justin’s feedback
@justinwigard highlights a number of areas where the lesson is already strong, as well as offering thoughtful suggestions for improvement under the four sections he articulated. The minor typographical and grammatical suggestions, as well as the consistency of the sister cities spelling and of the geoms, certainly require your attention. On a functional level, he notes that some additional context could be helpful for readers unfamiliar with the tidyverse or Wikidata.
He noted that providing counter-examples alongside some of the figures, like Figure 6, could help readers compare different cases, as well as adding more references on the choice of binwidth size (very often a rule of thumb, in my experience). He additionally suggests listing the tidyverse packages explicitly, and including a link to Wikidata, making the line about the dataset download more evident. He also suggests incorporating a screenshot to show how the tibble should appear after loading (I believe I previously suggested you consider something similar, like running
Again, as I noted in Amanda's feedback, please focus on refinement/consolidation first, and then consider expanding your lesson further.
A few extras
Here are a few extra comments from my side, mostly technically oriented.
> colnames(eudata)
[1] "X"
[2] "origincityLabel"
[3] "origincountry"
[4] "originlat"
[5] "originlong"
[6] "originpopulation"
[7] "sistercityLabel"
[8] "destinationlat"
[9] "destinationlong"
[10] "destinationpopulation"
[11] "destination_countryLabel"
[12] "dist"
[13] "eu"
[14] "samecountry"
[15] "typecountry"
A huge thank you for all your patience and hard work!🌟 |
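On the binwidth point raised in the summary above: one commonly cited rule of thumb is the Freedman-Diaconis rule, binwidth = 2 * IQR(x) / n^(1/3). The sketch below is only an illustration, not the lesson's own method. The helper `fd_binwidth()` and the simulated vector `simulated_dist` are hypothetical; the latter merely stands in for the `dist` column listed in the `colnames(eudata)` output.

```r
# Freedman-Diaconis rule of thumb: binwidth = 2 * IQR(x) / n^(1/3)
# (fd_binwidth is a hypothetical helper, not part of the lesson)
fd_binwidth <- function(x) {
  x <- x[!is.na(x)]                 # ignore missing values
  2 * IQR(x) / length(x)^(1 / 3)
}

# Simulated stand-in for the `dist` column of `eudata` (values are made up)
set.seed(1)
simulated_dist <- rexp(500, rate = 1 / 800)

bw <- fd_binwidth(simulated_dist)

# Optional: build the histogram only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(data.frame(dist = simulated_dist),
                       ggplot2::aes(x = dist)) +
    ggplot2::geom_histogram(binwidth = bw)
}
```

A rule like this only gives a starting value; comparing two or three binwidths side by side, as the counter-example suggestion implies, is usually more informative than trusting a single formula.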
Hello Igor @rogorido and Nabeel @nabsiddiqui,
What's happening now?
Your lesson has been moved to the next phase of our workflow which is Phase 5: Revision 2. This phase is an opportunity for you to revise your draft in response to the peer reviewers' feedback. Giulia @semanticnoodles has summarised their suggestions, but feel free to ask questions if you are unsure. Please make revisions via direct commits to your file: /en/drafts/originals/visualizing-data-with-r-and-ggplot2.md. @charlottejmc and I are here to help if you encounter any difficulties. When you and Giulia are all happy with the revised draft, the Managing Editor @hawc2 will read it through and provide additional feedback/suggestions as necessary before we move forward to Phase 6: Sustainability + Accessibility.
%%{init: { 'logLevel': 'debug', 'theme': 'dark', 'themeVariables': {
'cScale0': '#444444', 'cScaleLabel0': '#ffffff',
'cScale1': '#882b4f', 'cScaleLabel1': '#ffffff',
'cScale2': '#444444', 'cScaleLabel2': '#ffffff'
} } }%%
timeline
Section Phase 4 <br> Open Peer Review
Who worked on this? : Reviewers (@justinwigard + @regan008)
All Phase 4 tasks completed? : Yes
Section Phase 5 <br> Revision 2
Who's working on this? : Authors (@rogorido + @nabsiddiqui)
Expected completion date? : October 24
Section Phase 6 <br> Sustainability + Accessibility
Who's responsible? : Publishing Team
Expected timeframe? : 7~21 days
|
@semanticnoodles thanks for your review/feedback. We will make all corrections in the next few days. |
@regan008 and @justinwigard: Many thanks again for your comments and corrections. I have added many of them (cdfc89f) and @nabsiddiqui and I should think about two or three changes you are proposing which have maybe more profound consequences for the tutorial. In any case, just some comments:
In any case, we will still work on some of your comments (@semanticnoodles). Many thanks again. |
Thank you for your work so far, Igor @rogorido and Nabeel @nabsiddiqui ✨ Please let Giulia @semanticnoodles know when you feel you've completed the revisions. She will read through the draft again to confirm that she's satisfied with the suggestions integrated. |
@anisa-hawes yes will do it! |
Hello @rogorido, @anisa-hawes, and @semanticnoodles, Igor and I have added our edits, and I believe that we are all set to move to the next stage now. |
Thank you, @nabsiddiqui and @rogorido. Giulia @semanticnoodles will read through your revisions later this week, and advise if she feels any further adjustments are needed. After that, Alex will read it through and share additional feedback/suggestions as necessary. When both Giulia and Alex are happy, we will move forward to Phase 6: Sustainability + Accessibility which will begin with copyediting 🙂 |
@anisa-hawes OK, many thanks! |
Hello @rogorido & @nabsiddiqui, I apologise for the delay in posting this feedback. I have been going through the whole lesson again with @justinwigard's and @regan008's comments at hand. I think you have done a wonderful job of polishing the lesson, we are almost ready for Phase 6! 🎉 Please review the following points and we will be ready to move on - looking forward to seeing this brilliant lesson of yours available to the PH audience!
General Comments
Paragraph-specific comments
|
@semanticnoodles Thanks a lot for your comments. We will work on your corrections and I hope we will be ready in 2-3 days. |
Hello @semanticnoodles. @rogorido and I have finished our edits. I have set a seed in the R code to allow for reproducibility. I have also updated the images to reflect the sample data the user will get due to the seed. For the title, we were thinking perhaps "From Historical Data to Visual Analytics: The Grammar of Graphics in Practice"? I don't know what would be needed to change the title since the folders are based on the title. I am sure @anisa-hawes can help. Look forward to moving this ahead. |
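For readers of this thread, a minimal sketch of why setting a seed makes a sampled subset reproducible. It assumes the lesson draws a random sample of rows; the data frame `cities`, the seed value 123 and the sample size 50 are hypothetical and are not taken from the lesson.

```r
# Hypothetical stand-in for the lesson's dataset (values are made up)
cities <- data.frame(id = 1:1000, dist = seq(10, 10000, length.out = 1000))

set.seed(123)                        # fix the random number generator state
first_draw <- cities[sample(nrow(cities), 50), ]

set.seed(123)                        # reset to the same state
second_draw <- cities[sample(nrow(cities), 50), ]

# Both draws contain exactly the same rows in the same order
same_rows <- identical(first_draw, second_draw)
```

Because the same rows are drawn each time, figures generated from the sample should match what readers see when they run the code with the same seed and R version.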
Thank you, @nabsiddiqui. Yes, of course we can help with the practicalities of adjustments to any file and directory names. However, I think what Giulia @semanticnoodles is aiming towards is finding a title that is more specific. Fundamentally, we want to help readers find lessons that meet their learning goals. A clear title facilitates discovery through search, and offers a quick, basic sense of what can be learned. Reviewing our lesson directory, I think the most successful titles generally comprise:
The current title is: Visualizing Data with R and ggplot2. I was wondering whether your title could clarify what kind of data readers are handling with these methods? The concept of Sister Cities is mentioned, but what are you describing in general: demographic data? geographical/spatial data? ('mixed' data? - is the fact that you are selecting methods to visualise a range of different data types the key? 🤔) My sense is that an effective lesson title is usually simple and succinct. So, I think I'd suggest avoiding the colon and compound structure (more often encountered for an expanded research article title) and focusing on providing straightforward keys to the lesson. |
@anisa-hawes After talking with @nabsiddiqui, I think we will stick with the title proposed by Giulia. |
to @anisa-hawes' point, it would be nice to clarify what type of data this lesson teaches how to visualize - would it be fair to label it "Demographic Data"? |
I think it is more mixed data since some of it is about the cities themselves and some of it is about the demographics of the city. I like "Exploring and Visualizing Mixed Data in R with ggplot2". @rogorido is this ok with you? |
@nabsiddiqui Yes, perfect! |
Hi everybody, thank you @anisa-hawes and @hawc2 for stimulating these productive exchanges 🧠! The title solution you settled with sounds quite good to me; let us know your thoughts, @hawc2 and @anisa-hawes. Thanks a lot for fixing the last items, @rogorido & @nabsiddiqui, I highly appreciated you adding the seed for reproducibility 👏 On my side I think we are ready to move to phase 6 🎉 |
Thank you, Giulia @semanticnoodles. Hello Igor @rogorido and Nabeel @nabsiddiqui, The Managing Editor Alex @hawc2 will now read the lesson to confirm if it should be moved onwards to our Phase 6 (Sustainability and Accessibility checks, beginning with copyediting), or if he'd like to suggest any final revisions. Best, |
Hello Igor @rogorido, Nabeel @nabsiddiqui, and Giulia @semanticnoodles, Thank you for your thoughtful consideration of the lesson title. I've now updated the title across all the files: the lesson's new slug is now |
@charlottejmc, @anisa-hawes @semanticnoodles thanks! |
@rogorido and @nabsiddiqui this looks like a solid and well developed lesson with a clear scope and utility for those looking to learn how to present their research with R. My only request for further revision pertains to our prior discussion about the title and what kind of data this lesson shows the reader how to analyze and present. The concept of "mixed data" doesn't get discussed currently in the lesson, so the title will raise a basic question for the reader as to what that means. I must admit I'm not quite sure myself what "mixed data" refers to, so I do think a couple additional paragraphs of explanation about your dataset, at a high level, would help contextualize the following parts of the lesson. The core questions I think need further response: How does this lesson show visualization techniques specifically useful for "mixed data"? What is it about "mixed data" that is particularly complex, necessitating different measures for presentation than required for less mixed data? What types of data is this mixed dataset a mixture of, exactly? The opening section of the lesson jams together three different subsections (Introduction, Lesson goals, and Data). My recommendation would be to break those into three separate sections with headings provided for each, and to take more time to introduce the type of data (and specific sample dataset) at the core of your lesson on presenting data visualizations. It is worth going through the lesson as a whole with a mind to this question, as it would be nice to see you bring up the concept of mixed data (or the data central to this lesson) again during the central and concluding sections. Once you make revisions to address this issue, I'll do a brief line edit, and assuming I don't have any remaining questions, I'll send it on to copyedits and preparation for publication. Please let me know if you have any questions! |
@hawc2 Thanks for your comments. We will work on them in the next days! |
Thank you, Igor @rogorido. Remember that the lesson title + filename have been adjusted so the key links are slightly changed. You'll find the Markdown file here: /en/drafts/originals/exploring-visualizing-mixed-data-r-ggplot2.md. Please let Charlotte or me know if there's anything we can help with 🙂 |
@rogorido As an alternate title, I'd recommend changing it to: "Visualizing Urban and Demographic Data in R with ggplot2." I'm just not sold on the idea of 'mixed data,' and from what @nabsiddiqui said to my prior comment, the mixed data is a mixture of Urban and Demographic data, so why not just say that? It seems to be giving yourselves unnecessary work to try to explain "mixed data," which I fear doesn't have any legit research or technical precedent for you to lean on. Regardless of what title you pick, the lesson itself will need to be revised to explain the data you're using in more detail, and guide the reader into the lesson with introductory steps in the first few sections. |
Programming Historian in English has received a proposal for a lesson, 'Visualizing data with R and ggplot2,' by @rogorido and @nabsiddiqui.
I have circulated this proposal for feedback within the English team. We have considered this proposal for:
We are pleased to have invited @rogorido and @nabsiddiqui to develop this Proposal into a Submission under the guidance of @semanticnoodles as editor.
The Submission package should include:
We ask @rogorido and @nabsiddiqui to share their Submission package with our Publishing team by email, copying in @semanticnoodles.
We've agreed a submission date of April. We ask @rogorido and @nabsiddiqui to contact us if they need to revise this deadline.
When the Submission package is received, our Publishing team will process the new lesson materials, and prepare a Preview of the initial draft. They will post a comment in this Issue to provide the locations of all key files, as well as a link to the Preview where contributors can read the lesson as the draft progresses.
If we have not received the Submission package by April, @semanticnoodles will attempt to contact @rogorido and @nabsiddiqui. If we do not receive any update, this Issue will be closed.
Our dedicated Ombudspersons are Ian Milligan (English), Silvia Gutiérrez De la Torre (español), Hélène Huet (français), and Luis Ferla (português). Please feel free to contact them at any time if you have concerns that you would like addressed by an impartial observer. Contacting the ombudspersons will have no impact on the outcome of any peer review.