At first glance, Data Science seems pretty similar to traditional data analysis or data processing. But Data Science raises the use of data to a new level. Instead of reporting "what happened", Data Science generates measurable added value based purely on data. Measurable value implies that the impact of the data science effort is immediate, or at least that one can show a direct dependency between the data science work and the additional value generated.
Data Science is science, but not in the sense that its results are used by, or are even useful for, science. It is science because its way of working is identical to the way of working in science. This means that all results can be reproduced and validated in a transparent manner. There is no room for magic or ad hoc methods that lack a solid foundation. This includes well-justified statistical and theoretical foundations, but also reasonable and comprehensible relationships in the given context.
At the end of the day, what matters are the decisions made. Sadly, most decisions today aren't really made at all: because of their sheer number, they are processed by simple business rules. Furthermore, even when micro-decisions like 'which price to set' or 'how many items to order' are made by humans, humans are often really bad at these micro-decisions. As explained wonderfully in Daniel Kahneman's "Thinking, Fast and Slow", human decisions are highly biased and error-prone. Data Science has been shown to be able to automate these micro-decisions and to outperform the quality of human decisions in many cases and fields.
Techniques, algorithms and tools are obviously a very important part of Data Science. Without knowing the details of tools and algorithms, a Data Science effort will rapidly exceed all given resource limitations. But because the results are measurable, it is even more important to implement the statistical techniques correctly. The statistical properties define the quality of the result, not the algorithm used to get there.
Nice-looking data visualizations or data stories are just the visible tip of the iceberg. Most of Data Science operates inside the value chains, the production processes or the business processes. It must even be deeply embedded inside these core processes in order to yield the expected value. A successful data science effort cannot be placed on top of these highly protected areas; it takes courage to change these processes in order to take advantage of the data science endeavor.
Data is the raw material of the Data Scientist. But data by its nature isn't something anonymous. The Data Scientist understands this responsibility and handles the data with appropriate care. This includes protecting privacy, but also guarding against data loss and preserving and increasing the quality of the data. Even actively collecting data and making it available is in his/her own interest, as he/she loves data.
As Data Science involves complex algorithms and transformations, other data scientists may not be able to tell right away what a data science model does. In any case, judging whether a piece of data science work is correct and does what it is meant to do is not possible without further investigation. It is therefore crucial for the Data Scientist to be honest about his/her work and to be as transparent as possible. It is all too easy to fool others about the quality of the work, and this would harm the whole data science field.
When it comes to resources, data science is greedy by nature. There is no intrinsic limit to its resource consumption. It is the duty of the Data Scientist to keep the resource footprint at a moderate level. If two algorithms, models or technologies yield similar results, the Data Scientist chooses the one that uses fewer resources. Resources include not only computational resources, but also maintenance effort and operational complexity.
The Data Scientist seems to be a new role in the IT organization, but there is no need for a new role; there are more than enough silos. The Data Scientist is a software developer and an operator. He/she stays in steady contact with all departments. Because he/she writes code that is part of the production system, and which therefore needs to fulfill the same quality criteria as all other software in the organization, he/she actively works together with the software developers. Because of the significant computational resource footprint, the Data Scientist will bring the Data Science effort seamlessly into production only if he/she cultivates a lively collaboration with the operations department.
Due to its intervening nature, Data Science comes with significant costs, and the return on investment is often not trivial to estimate. Therefore it is natural for a Data Scientist to be good at selling his/her work. This includes knowledge of business processes and business economics as well as marketing and sales talent.
The Data Scientist has a scientific role. As a scientist, the Data Scientist is curious about the domain being studied. The Data Scientist is free to come up with models, assumptions and theories, but applies sceptical reasoning and canonical knowledge to judge all of these, looking at their evidence and plausibility. Cross-validation and testing are necessary steps for the Data Scientist, but they are not the only means the Data Scientist relies on to evaluate a model.
Knowing "Occam's Razor", the Data Scientist is aware of the trade-off between complex and simple models, preferring a simpler, more general model to a complicated, specific one, provided that the prediction quality of both is acceptable.
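One possible way to put the last two principles into practice is sketched below: candidate models are scored by k-fold cross-validation, and the simplest one whose error is close to the best is chosen. This is only an illustration under stated assumptions, not a prescribed method. The two candidate models, the function names (`fit_constant`, `fit_linear`, `cv_mse`, `pick_simplest_acceptable`) and the 10% tolerance are all hypothetical choices made for this example.

```python
import random
import statistics


def fit_constant(xs, ys):
    """Simplest candidate: always predict the mean of the training targets."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """More complex candidate: least-squares line y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0  # fall back to a flat line if x never varies
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def cv_mse(fit, xs, ys, k=5, seed=0):
    """Mean squared error over all held-out points of a k-fold split."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)  # fixed seed keeps the result reproducible
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors.extend((model(xs[i]) - ys[i]) ** 2 for i in fold)
    return statistics.fmean(errors)


def pick_simplest_acceptable(candidates, xs, ys, tolerance=0.10):
    """Return (name, scores) for the first candidate, in simplest-first order,
    whose cross-validated error is within `tolerance` of the best score.
    The tolerance value is an assumption of this sketch."""
    scores = {name: cv_mse(fit, xs, ys) for name, fit in candidates}
    best = min(scores.values())
    for name, _ in candidates:
        if scores[name] <= best * (1 + tolerance):
            return name, scores


# Candidates listed from simplest to most complex.
CANDIDATES = [("constant", fit_constant), ("linear", fit_linear)]

if __name__ == "__main__":
    rng = random.Random(42)
    xs = [float(x) for x in range(30)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]  # synthetic data
    chosen, scores = pick_simplest_acceptable(CANDIDATES, xs, ys)
    print(chosen, scores)
```

The same selection rule could rank candidates by resource footprint instead of model complexity. As the text notes, a cross-validation score alone does not settle whether a model is plausible in its domain.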