Data science

Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured, [1] [2] similar to data mining.

Data science is a “concept to unify statistics, data analysis and their related methods” in order to “understand and analyze actual phenomena” with data. [3] It employs techniques and theories drawn from many fields within the broad areas of mathematics, statistics, information science, and computer science, in particular from the subdomains of machine learning, classification, cluster analysis, data mining, databases, and visualization.

Turing Award winner Jim Gray imagined data science as a “fourth paradigm” of science (empirical, theoretical, computational and now data-driven) and asserted that “everything about science is changing because of the impact of information technology” and the data deluge. [4] [5]

When Harvard Business Review called it “The Sexiest Job of the 21st Century”, [6] the term became a buzzword. It is now often applied to business analytics, [7] or even to arbitrary use of data, or used as a sexed-up term for statistics. [8] While many university programs offer a data science degree, there is no consensus on a definition or curriculum. [7] Because of the current popularity of the term, there are many “advocacy efforts” surrounding it. [9]


Data science process flowchart from “Doing Data Science”, Cathy O’Neil and Rachel Schutt, 2013

The term “data science” (originally used interchangeably with “datalogy”) has existed for over thirty years and was initially used as a substitute for computer science by Peter Naur in 1960. In 1974, Naur published Concise Survey of Computer Methods, which freely used the term data science in its survey of the data processing methods that are used in a wide range of applications.

In 1996, members of the International Federation of Classification Societies (IFCS) met in Kobe for their biennial conference. Here, for the first time, the term data science was included in the title of the conference (“Data Science, classification, and related methods”), [10] after Chikio Hayashi introduced the term in a roundtable discussion. [3]

In November 1997, C. F. Jeff Wu gave the inaugural lecture entitled “Statistics = Data Science?” [11] for his appointment to the H. C. Carver Professorship at the University of Michigan. [12] In this lecture, he characterized statistical work as a trilogy of data collection, data modeling and analysis, and decision making. In his conclusion, he initiated the modern, non-computer science, use of the term “data science” and advocated that statistics be renamed data science and statisticians data scientists. [11] Later, he presented his lecture entitled “Statistics = Data Science?” at the P. C. Mahalanobis Memorial Lectures. [13]

In 2001, William S. Cleveland proposed data science as a discipline in its own right in his article “Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics”, which was published in Volume 69, No. 1, of the April 2001 edition of the International Statistical Review. [14] In his report, Cleveland established six technical areas which he believed to encompass the field of data science: multidisciplinary investigations, models and methods for data, computing with data, pedagogy, tool evaluation, and theory.

In April 2002, the International Council for Science: Committee on Data for Science and Technology (CODATA) [15] started the Data Science Journal, [16] a publication focused on issues such as the description of data systems, their publication on the internet, applications and legal issues. [17] Shortly thereafter, in January 2003, Columbia University began publishing The Journal of Data Science, [18] which provided a platform for all data workers to present their views and exchange ideas. The journal was largely devoted to the application of statistical methods and quantitative research. In 2005, the National Science Board published “Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century”. [19]

In the 2012 Harvard Business Review article “Data Scientist: The Sexiest Job of the 21st Century”, [6] DJ Patil claims to have coined this term in 2008 with Jeff Hammerbacher to define their jobs at LinkedIn and Facebook, respectively. The article presents the data scientist as a new kind of professional in short supply, while describing a markedly business-oriented role.

In 2013, the IEEE Task Force on Data Science and Advanced Analytics [20] was launched, and the first international conference, the IEEE International Conference on Data Science and Advanced Analytics, was launched in 2014. [21] In 2014, the American Statistical Association Section on Statistical Learning and Data Mining renamed its journal Statistical Analysis and Data Mining: The ASA Data Science Journal. [9] In 2015, the International Journal on Data Science and Analytics [22] was launched by Springer to publish original work on data science and big data analytics. In 2013, the first ”


Although use of the term “data science” has exploded in business environments, many academics and journalists see no distinction between data science and statistics. Writing in Forbes, Gil Press argues that data science is a buzzword without a clear definition and has simply replaced “business analytics” in contexts such as graduate degree programs. [7] In the question-and-answer section of his keynote address at the Joint Statistical Meetings of the American Statistical Association, noted statistician Nate Silver said, “I think data-scientist is a sexed up term for a statistician …. Statistics is a branch of science.” [8]

Data scientist

Data scientists use their data and analytical ability to find and interpret rich data sources; manage large amounts of data despite hardware, software, and bandwidth constraints; merge data sources; ensure consistency of datasets; create visualizations to aid in understanding data; build mathematical models using the data; and present and communicate the data insights and findings. They are often expected to produce results as quickly as possible. [23]
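As a rough illustration of three of the tasks listed above (merging data sources, checking consistency, and building a simple mathematical model), here is a minimal sketch in plain Python. The record layouts, field names (`id`, `customer_id`, `ad_spend`, `revenue`) and all values are invented for this example and do not come from any cited source; real work would typically rely on dedicated data libraries.

```python
# Hypothetical toy data: every name and value below is invented for illustration.
customers = [
    {"id": 1, "region": "north"},
    {"id": 2, "region": "south"},
    {"id": 3, "region": "north"},
]
orders = [
    {"customer_id": 1, "ad_spend": 10.0, "revenue": 25.0},
    {"customer_id": 2, "ad_spend": 20.0, "revenue": 45.0},
    {"customer_id": 3, "ad_spend": 30.0, "revenue": 65.0},
    {"customer_id": 4, "ad_spend": 40.0, "revenue": 85.0},  # no matching customer
]


def merge_sources(customers, orders):
    """Join orders to customers on the customer id; report orders that do not match."""
    by_id = {c["id"]: c for c in customers}
    merged, unmatched = [], []
    for order in orders:
        customer = by_id.get(order["customer_id"])
        if customer is None:
            unmatched.append(order)  # consistency problem: unknown customer
        else:
            merged.append({**customer, **order})
    return merged, unmatched


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


merged, unmatched = merge_sources(customers, orders)
slope, intercept = fit_line(
    [row["ad_spend"] for row in merged],
    [row["revenue"] for row in merged],
)
print(f"{len(merged)} merged rows, {len(unmatched)} unmatched")
print(f"revenue ~ {slope:.2f} * ad_spend + {intercept:.2f}")
```

The unmatched list is kept rather than silently dropped, since surfacing such records is part of ensuring dataset consistency.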

“Data Scientist” has become a popular occupation, with Harvard Business Review dubbing it “The Sexiest Job of the 21st Century” [6] and McKinsey & Company projecting a global excess demand of 1.5 million new data scientists. [24] Universities are offering master's courses in data science. [25] Shorter private bootcamps also offer data science certificates, ranging from student-paid programs like General Assembly to employer-paid programs like The Data Incubator. [26]


Main article: List of statistical packages

In the 2010-2011 time frame, data science software reached an inflection point where open source software started supplanting proprietary software. [27] The use of open source software allows modifying and extending the software, and it allows sharing of the resulting algorithms. [28] [29] [30]


  1. Dhar, V. (2013). “Data science and prediction”. Communications of the ACM. 56 (12): 64. doi: 10.1145/2500499.
  2. Jeff Leek (2013-12-12). “The key word in ‘Data Science’ is not Data, it is Science”. Simply Statistics.
  3. Hayashi, Chikio (1998-01-01). “What is Data Science? Fundamental Concepts and a Heuristic Example”. In Hayashi, Chikio; Yajima, Keiji; Bock, Hans-Hermann; Ohsumi, Noboru; Tanaka, Yutaka; Baba, Yasumasa. Data Science, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer Japan. pp. 40-51. ISBN 9784431702085. doi: 10.1007/978-4-431-65950-1_3.
  4. Stewart Tansley; Kristin Michele Tolle (2009). The Fourth Paradigm: Data-intensive Scientific Discovery. Microsoft Research. ISBN 978-0-9825442-0-4.
  5. Bell, G.; Hey, T.; Szalay, A. (2009). “COMPUTER SCIENCE: Beyond the Data Deluge”. Science. 323 (5919): 1297-1298. ISSN 0036-8075. doi: 10.1126/science.1170411.
  6. Davenport, Thomas H.; Patil, DJ (Oct 2012), Data Scientist: The Sexiest Job of the 21st Century, Harvard Business Review
  7. “Data Science: What’s The Half-Life Of A Buzzword?”. Forbes. 2013-08-19.
  8. “Nate Silver: What I need from statisticians”. Statistics Views. 23 Aug 2013.
  9. Talley, Jill (2016-06-01). “ASA Expands Scope, Outreach to Foster Growth, Collaboration in Data Science”. AMSTATNEWS. American Statistical Association. Retrieved 2017-02-04.
  10. Press, Gil. “A Very Short History of Data Science”.
  11. Wu, C. F. J. (1997). “Statistics = Data Science?” (PDF). Retrieved 9 October 2014.
  12. Identity of statistics in science examined. The University Records, 9 November 1997, The University of Michigan. Retrieved 12 August 2013.
  13. “P. C. Mahalanobis Memorial Lectures, 7th series”. P. C. Mahalanobis Memorial Lectures, Indian Statistical Institute. Archived from the original on 26 Feb 2017. Retrieved 18 Jul 2017.
  14. Cleveland, W. S. (2001). Data science: an action plan for expanding the technical areas of the field of statistics. International Statistical Review / Revue Internationale de Statistique, 21-26
  15. International Council for Science: Committee on Data for Science and Technology. (2012, April). CODATA, The Committee on Data for Science and Technology. Retrieved from International Council for Science.
  16. Data Science Journal. (2012, April). Available Volumes. Retrieved from Japan Science and Technology Information Aggregator, Electronic.
  17. Data Science Journal. (2002, April). Contents of Volume 1, Issue 1, April 2002. Retrieved from Japan Science and Technology Information Aggregator, Electronic.
  18. The Journal of Data Science. (2003, January). Contents of Volume 1, Issue 1, January 2003.
  19. National Science Board. “Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century”. National Science Foundation. Retrieved 30 June 2013.
  20. “IEEE Task Force on Data Science and Advanced Analytics”.
  21. “2014 IEEE International Conference on Data Science and Advanced Analytics”.
  22. “Journal on Data Science and Analytics”.
  23. Nguyen, Thomson. “Data scientists vs. data analysts: Why the distinction matters”. Retrieved 2 October 2015.
  24. “Big data: The next frontier for innovation, competition, and productivity”.
  25. “Big Data Analytics Masters”. Information Week. Retrieved 2016-02-22.
  26. “NY gets new bootcamp for data scientists: It’s free, but harder to get into than Harvard”. Venture Beat. Retrieved 2016-02-22.
  27. Chalef, Daniel (2016-03-20). “Data Science Tools – Are Proprietary Vendors Still Relevant?”. Retrieved 2016-11-07.
  28. Asay, Matt. “For data scientists, the big money is in open source”. TechRepublic. Retrieved 6 November 2016.
  29. Jones, M. Tim. “Data science and open source”. IBM DeveloperWorks. IBM. Retrieved 6 November 2016.
  30. Talbert, Neera. “Open Source Software Fuels a Revolution in Data Science”. InsideBIGDATA. Retrieved 6 November 2016.
