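Overview

On this site I summarize what I study and collect learning resources, both to reinforce my own learning and to share the good ones with everyone.

To start, this post explains the main differences between DS (Data Science) and BA (Business Analytics), with supplementary notes on DE (Data Engineer) and DA (Data Analyst).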

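Introduction

DS stands for Data Science. Broadly speaking, as the name suggests, any scientific research that involves data counts as data science.

Wikipedia describes DS this way: "In general terms, Data Science is the extraction of knowledge from data, which is a continuation of the field data mining and predictive analytics, also known as knowledge discovery and data mining." More concretely, data science means mining, processing, and analyzing data in order to extract the information and knowledge latent in it.

BA stands for Business Analytics. It is an emerging discipline that builds on business knowledge, uses mathematics and programming as its tools, starts from data analysis, and creates value by optimizing decisions, putting Big Data to commercial use.

MIT's Sloan School of Management positions its BA program as follows: "Prepares students for careers that apply and manage modern data science to solve critical business challenges." In short, BA contributes to business decisions through the management and analysis of modern quantitative data.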

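The Shared Origins of DS and BA

DS and BA both emerged from the big-data era's market demand for data analysis: Statistics and CS graduates lacked an understanding of business and could not meet that demand, so BA and DS grew up to fill the gap.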

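Academic Foundations

DS evolved from computer science. Its foundations differ from those of BA and include engineering, computer engineering, and computer science; DS also draws on specialist topics such as Machine Learning, Cloud Computing, and Optimization.

BA grew out of the Applied Statistics branch of the M.S. in Statistics. Its foundation is statistics, and it also makes use of Data Mining and Regression Models.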

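Timeline

Between 2011 and 2013, Statistics and CS were both very popular majors, because their students could handle large volumes of data and had strong data analysis skills. Unfortunately, these students knew little about Business and Marketing, so their analyses delivered limited benefit to modern companies. As a result, from 2013 onward, universities began launching Business Analytics and Data Science programs one after another.
Although DS and BA share deep roots, there are also many differences between them.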

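Curriculum Mix and Applicant Background

Data Science = 30% Statistics + 50% Computer Science + 20% Application
DS programs set a higher bar for applicants' backgrounds. They suit applicants from science and engineering, and applicants with some programming experience can also apply. Strongly quantitative business majors, such as financial engineering, are a good fit as well.

MS Business Analytics = 40% Statistics + 30% Computer Science + 30% Business
BA programs place few restrictions on background: applicants from the humanities, business, or science and engineering can all apply.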

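Curriculum

The DS curriculum emphasizes the integration of mathematics, statistics, and computer science, and focuses on training students to interpret and analyze data computationally.
Core courses include: statistical theory, linear algebra, analysis of algorithms, database systems, and so on.
Advanced courses include: information science, artificial intelligence, machine learning, and so on.

The BA curriculum combines three disciplines, statistics, computing, and business, and aims for a balance between quantitative programming and management science.
Core courses include: statistics, data analysis, data visualization, business decision-making, and so on.
Advanced courses include: databases, data mining, and so on.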

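Career Paths

DS graduates go on to roles such as Data Scientist, Data Engineer, and Data Analyst, so the range of options is very wide. The main work includes building data models, data architecture, and data governance and storage, with the goal of organizing data so that storage cost is minimized and queries run more efficiently. McKinsey research predicted that by 2018 the United States alone would face a shortage of 140,000 to 190,000 data analytics professionals, plus demand for 1.5 million people able to make decisions and manage on the basis of big-data analysis, so market demand is strong.

Judging by employment data, Data Science graduates find jobs in the United States more easily. DS programs usually sit within engineering schools and count as STEM programs, which is an advantage for OPT duration and work visas, and technical roles also place lower demands on spoken communication. The advantage of Business Analytics is a wider range of options after returning home: graduates can take technical roles, go into consulting or marketing, and quite a few join VC / PE firms. Their skills are more varied, and most BA graduates do return home to build their careers.

BA graduates mainly go into investment banks, the Big Four, consulting, and technology companies. They specialize in collecting, organizing, and analyzing data within a given industry and use that data to research the industry. Titles vary by industry: consultant, data analyst, statistical analyst, and so on. The growth trend for these roles is beyond dispute, and the job outlook is very broad.

In the big-data era, many industries need people who are good at mining and analyzing data, for example IT, internet, consulting, telecommunications, finance, pharmaceuticals, and retail. BA graduates are therefore in high demand, and pay is good.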

GFAA (Google, Facebook, Apple, Amazon) Requirements

Google

Data Scientist, Engineering

RESPONSIBILITIES
  • Work with large, complex data sets. Solve difficult, non-routine analysis problems, applying advanced analytical methods as needed. Conduct analysis that includes data gathering and requirements specification, processing, analysis, ongoing deliverables, and presentations.
  • Build and prototype analysis pipelines iteratively to provide insights at scale. Develop comprehensive knowledge of Google data structures and metrics, advocating for changes where needed for product development.
  • Interact cross-functionally, making business recommendations (e.g., cost-benefit, forecasting, experiment analysis) with effective presentations of findings at multiple levels of stakeholders through visual displays of quantitative information.
  • Research and develop analysis, forecasting, and optimization methods to improve the quality of Google's user facing products.
Minimum qualifications:
  • Master's degree in a quantitative discipline (e.g., Statistics, Operations Research, Bioinformatics, Economics, Computational Biology, Computer Science, Mathematics, Physics, Electrical Engineering, Industrial Engineering) or equivalent practical experience.
  • 2 years of work experience in data analysis related field.
  • Experience with statistical software (e.g., R, Python, MATLAB, pandas) and database languages (e.g., SQL)
Preferred qualifications:
  • PhD degree in a quantitative discipline.
  • 4 years of relevant work experience, including expertise with statistical data analysis such as linear models, multivariate analysis, stochastic models, sampling methods.
  • Applied experience with machine learning on large datasets.
  • Experience articulating and translating business questions and using statistical techniques to arrive at an answer using available data.
  • Demonstrated leadership and self-direction. Willingness to both teach others and learn new techniques.
  • Demonstrated skills in selecting the right statistical tools given a data analysis problem. Effective written and verbal communication skills.

Business Intelligence Analyst, Google People Services

RESPONSIBILITIES
  • Structure and perform independent analyses, and package findings to help inform organizational decision-making.
  • Partner with Operations leadership to define and analyze key success metrics, measure efficiency, and inform the best possible business decisions.
  • Participate in strategy development and implementation of processes and tools (including machine learning) to allow People - Services Operations to scale effectively as Google grows.
  • Serve as a subject matter expert on the Google People Services Operations and People Operations process landscape, and identify opportunities to improve processes where needed.
Minimum qualifications:
  • Bachelor's degree in Mathematics, Business Administration, Computer Science, Finance, Statistics, related field or equivalent practical experience.
  • 4 years of industry experience.
  • Programming experience (R, Python, SQL, etc.).
  • Experience in management consulting or business strategy, with work experience as an analyst or in an analytical role.
Preferred qualifications:
  • Excellent analytical and problem-solving skills, with capability to process large amounts of data to drive business strategies and decisions.
  • Excellent communications skills.
  • Ability to self-start and self-direct work in an unstructured and fast-paced environment.
  • Ability to work comfortably in ambiguous situations.

Facebook

Data Scientist Infrastructure

RESPONSIBILITIES
  • Leverage data and business principles to create and drive large scale FB Data Center programs
  • Define and develop the program for metrics creation, data collection, modeling, and reporting the operational performance of Facebook’s data centers
  • Work cross-functionally to define problem statements, collect data, build analytical models and make recommendations
  • Be a self-starter, motivated by a passion for developing the best possible solutions to problems
  • Identify and implement streamlined processes for data reporting and communication
  • Use analytical models to identify insights that are used to drive key decisions across the organization
  • Routinely communicate metrics, trends and other key indicators to leadership
  • Provide leadership and mentorship to other members of the team
  • Lead and support various ad hoc projects, as needed, in support of Facebook’s Data Center strategy
  • Build and maintain data driven optimization models, experiments, forecasting algorithms and capacity constraint models
  • Leverage tools like R, Tableau, PHP, Python, Hadoop & SQL to drive efficient analytics
MINIMUM QUALIFICATIONS
  • Degree in an analytical field (e.g. Computer Science, Engineering, Mathematics, Statistics, Operations Research, Management Science)
  • 3+ years of experience in a role with data analysis and metrics development
  • 3+ years of hands-on experience analyzing and interpreting data, drawing conclusions, defining recommended actions, and reporting results across stakeholders
  • 3+ years of SQL development experience writing queries
  • 3+ years of hands-on project management experience
  • 3+ years of experience with data visualization tools
  • 3+ years of experience with packages such as R, Tableau, SPSS, SAS, STATA, etc.
  • 2+ years of experience with scripting in Python or PHP
  • Experience leveraging data driven models to drive business decisions
  • Experience using data access tools and building visualizations using large datasets and multiple data sources
  • Experience thinking analytically
  • Experience communicating data to all organizational levels
  • Experienced with packages such as NumPy, SciPy, pandas, scikit-learn, dplyr, ggplot2
  • Knowledge of statistics and optimization techniques
  • Hands-on experience with medium to large datasets (i.e. data extraction, cleaning, analysis and presentation)
PREFERRED QUALIFICATIONS
  • Technical knowledge of data center operations
  • 2 plus years of industry or graduate research experience.
  • Hands-on experience with datasets (i.e. data extraction, cleaning, analysis and presentation).
  • Demonstrated ability to provide insights from data sets.
  • Understanding of statistics and optimization techniques.

Business Analyst, Global Business Marketing

RESPONSIBILITIES
  • Develop a deep understanding of existing business using data to build insights about our advertisers, products and competitors
  • Create, maintain, and improve key data sets, pipelines and reporting to track and manage Facebook Global Business Marketing's activities
  • Develop new and iterate on existing dashboards and reporting tools to automate reporting
  • Drive consistency in execution to maintain and improve reporting and analytical quality
  • Work with Digital Marketing stakeholders to provide analytical support for ongoing needs and large initiatives
  • Effectively communicate complex analytical concepts to non-technical stakeholders to drive data driven decision making
  • Provide business analytic strength to help drive initiatives critical to ongoing growth
  • Conduct insightful analysis using internal and external data (e.g. revenue, product, market, industry) to derive insights that will drive business decisions
  • Focus on process and continuous improvement of core projects through automation and process enhancement
  • Analyze current channel, vertical, and sub-vertical performance and identify levers for driving revenue growth
  • Drive operational excellence in the marketing organization through identification and execution of opportunity areas that create efficiency, remove obstacles, or create improved processes and approaches to the business
  • Ability to synthesize impact on financials/economics on our business and provide business analytic strength to help drive initiatives critical to ongoing growth
  • Project manage improvements to drive higher efficiencies in day-to-day operations using a data-driven approach
MINIMUM QUALIFICATIONS
  • BA/BS in Engineering, Statistics/Math, Business or related field
  • 5+ years of work experience in Marketing or Web Analytics, or 3+ years experience with MBA/Master's degree
  • 2+ years of experience in SQL
  • 2+ years of statistical experience (creating attribution models, A/B tests or lift analysis)
  • Experience with a scripting/templating language (ex. Python, JavaScript, JINJA)
  • Demonstrated experience problem solving and providing business insights and recommendations from data
  • Demonstrated experience in presenting technical content across different stakeholder audiences
  • Excel and Tableau experience
PREFERRED QUALIFICATIONS
  • MBA or graduate degree in a quantitative field
  • Ability to manage multiple concurrent projects and drive initiatives in a cross-functional environment
  • Experience or familiarity with CRM platforms (ex. Salesforce, Oracle CRM, SAP CRM)
  • Experience with online advertising
  • Knowledge of other programming languages
  • Experience working with or in support of diverse communities

Apple

Data Scientist - Applied Machine Learning

Key Qualifications
  • Strong background in machine learning and statistical modeling.
  • Strong experience with time series modeling, forecasting, anomaly detection, user profiling and segmentation, behavioral targeting etc.
  • Strong coding skills and experience in Python based on state-of-the-art machine learning and neural network methodologies (e.g., TensorFlow, PyTorch) for training and serving.
  • Hands-on experience with big data systems (e.g., MapReduce, Spark) with TB to PB scale datasets.
  • Passion for applying advanced methods, and innovating approaches at the intersection of machine learning, optimization, and computer science.
  • Proven track record of formulating business problems into concrete mathematical framework, and translating analytic results into actionable business recommendations.
  • Self-learner and has a thirst of continuing learning with a passion for work, attention to detail, and a can-do attitude.
Education & Experience
  • PhD degree in Computer Science, Statistics, Operations Research, Mathematics or related field.

Amazon

Data Scientist

MINIMUM QUALIFICATIONS
  • M.S. (or equivalent) in Computer Science, Statistics, Math, Engineering, or related fields; and/or relevant industry experience
  • 2+ years of quantitative experience in Logistics/Supply Chain, Transportation, Engineering or related Businesses
  • 2+ years of experience with one or more programming languages (e.g. Python, Java, C++, C#, Ruby)
  • 2+ years of experience in machine-learning packages (e.g. supervised and unsupervised learning, clustering, random forests, etc...) and/or statistical analysis tools (e.g.: regression analysis, hypothesis testing, time series analysis, etc...)
  • 2+ years of experience with data processing technologies: AWS technologies, SQL, data pipelines, etc...
  • 2+ years of experience with large-scale data: extracting, processing, analyzing, and representing large quantities of data (e.g.: millions to billions of records)
PREFERRED QUALIFICATIONS
  • Ph.D. in Computer Science, Statistics, Math, Engineering, or related fields
  • 4+ years of quantitative experience in Logistics/Supply Chain, Transportation, Engineering or related Businesses
  • 4+ years of experience with one or more programming languages (e.g. Python, Java, C++, C#, Ruby)
  • 4+ years of experience in machine-learning packages (e.g. supervised and unsupervised learning, clustering, random forests, etc...) and/or statistical analysis tools (e.g.: regression analysis, hypothesis testing, time series analysis, etc...)
  • 4+ years of experience with data processing technologies: AWS technologies, SQL, data pipelines, etc...
  • 4+ years of experience with large-scale data: extracting, processing, analyzing, and representing large quantities of data (e.g.: millions to billions of records)

Business Intelligence Engineer

MINIMUM QUALIFICATIONS
  • Bachelor's degree in Computer Science, Engineering, Math, Finance, Statistics or related discipline
  • 3+ years' experience as a Business Intelligence Engineer or Data / Business Analyst
  • Proficiency with data querying or modeling technique with SQL
  • Experience with automated self-service reporting tools
  • Confidence in dealing with technical and non-technical senior level staff on
  • Ability to operate successfully and independently in a fast-paced environment
  • Strong critical thinking and analytical skills to drive clarity on ambiguous problems
  • Self-motivated with critical attention to detail, deadlines, and reporting
PREFERRED QUALIFICATIONS
  • Master’s degree in Computer Science, Engineering, Math, Finance, Statistics or related discipline
  • 5+ years’ experience with 2+ years Analytics and Process Improvement-related initiatives
  • Experience working with cloud data repository services such as AWS redshift
  • Knowledge and direct experience using business intelligence reporting tools

Supplement: DA & DE

Data Analyst

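The DA role sits between DS and BA. In terms of technical depth, a DA does not need Deep Learning. Typically a DA has a BA's skill set plus stronger programming ability: fluency in Python or R, and the ability to analyze and visualize data with NumPy, pandas, Matplotlib, Seaborn, and similar libraries (or ggplot2 in R). Being able to collect data with web scrapers, or to use Machine Learning for classification, regression, and other prediction tasks, is even better.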
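To make the DA toolkit described above a little more concrete, here is a minimal sketch of a typical pandas and NumPy workflow. The product names and order counts are made up purely for illustration and do not come from any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: daily order counts for two made-up product lines.
df = pd.DataFrame({
    "product": ["A", "A", "A", "B", "B", "B"],
    "orders": [12, 15, 9, 30, 28, 35],
})

# Per-product summary statistics, a typical first analysis step.
summary = df.groupby("product")["orders"].agg(["mean", "max"])

# NumPy works directly on the column's underlying array.
overall_std = float(np.std(df["orders"].to_numpy()))

print(summary)
print(f"overall std: {overall_std:.2f}")
```

If Matplotlib is installed, a call such as `summary["mean"].plot(kind="bar")` would turn the same summary into a bar chart, which is usually how this kind of analysis gets presented.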

Data Engineer

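A Data Engineer mainly designs, develops, and deploys large-scale data services and platforms. The role usually requires fluency in programming languages such as Java, Python, and Scala, along with big-data technologies such as Spark and Hadoop.

[Jianshu] A Growth Guide for Data Engineers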

Employment Summary

Salary

DS ($117,345/yr) > DE ($116,591/yr) > DA ($72,046/yr) > BA ($69,163/yr) (data from Glassdoor)
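The ranking above is easier to read with the gaps spelled out. The short sketch below uses only the four Glassdoor figures quoted here and expresses each salary relative to BA; the variable names are illustrative.

```python
# Salary figures quoted above from Glassdoor (USD per year).
salaries = {"DS": 117_345, "DE": 116_591, "DA": 72_046, "BA": 69_163}

# Sort roles from highest to lowest pay.
ranked = sorted(salaries.items(), key=lambda kv: kv[1], reverse=True)

# Express each salary as a multiple of the BA figure.
baseline = salaries["BA"]
for role, pay in ranked:
    print(f"{role}: ${pay:,} ({pay / baseline:.2f}x BA)")

# DS and DE are nearly tied; the big step is between DE and DA.
ds_de_gap = salaries["DS"] - salaries["DE"]
de_da_gap = salaries["DE"] - salaries["DA"]
print(f"DS - DE: ${ds_de_gap:,}; DE - DA: ${de_da_gap:,}")
```

Read this way, the figures fall into two tiers: DS and DE on one level, DA and BA on another.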

Application Difficulty and Number of Openings

Technical Requirements

Title Stats Python/R SQL A/B Test Experiment Design Algorithm ML Case Big Data Visualization Title
DS (Engineer) !! !! DS (Engineer)
DS (Analytics) ? !! !! ? DS (Analytics)
DA ? ? !! DA
BA ? !! !! BA

DS

  • Java, Python, Swift (Algorithm)
  • Data analysis (NumPy, pandas, Matplotlib)
  • Machine Learning (sklearn)
  • Deep Learning (TensorFlow, PyTorch)
  • Big Data (Hadoop, Spark, Flink)
  • NLP, CV
  • SQL, Stats, A/B Test

DE

  • Java, Scala, Python
  • Big Data (Hadoop, Spark, Flink)

DA

  • Python or R
  • SQL, Stats, A/B Test
  • Data analysis (NumPy, pandas, Matplotlib) or (ggplot2)
  • Machine Learning (sklearn)

BA

  • Python or R or Excel
  • SQL
  • Data analysis (NumPy, pandas, Matplotlib, sklearn) or (ggplot2)

Core Skills

  • Python
  • SQL
  • Algorithm
  • Stats
  • A/B Test
  • Experiment Design
  • Visualization
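
Of the skills listed above, the A/B test is the easiest to show in a few lines. The sketch below is one common way to do it, a two-sided two-proportion z-test using only the Python standard library. The function name and the conversion counts are hypothetical, chosen only to illustrate the calculation.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (B minus A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 5,000 users per variant,
# 200 conversions for A and 250 for B.
z, p = two_proportion_z_test(200, 5000, 250, 5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the difference between variants is unlikely to be chance alone. In practice the sample size and significance threshold should be fixed before the experiment runs, which is where the Experiment Design item comes in.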

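Given how many SDE jobs there are, and how strict many DS positions are (requiring a Ph.D., for example), it is a good idea to build SDE or Full Stack skills while preparing for DS. Doing so makes you more competitive, and SDE and Full Stack skills also raise your level and effectiveness in DS, DE, DA, and BA work.

Some of the information above was compiled from online sources, and I am still learning myself. If anything is inaccurate or incomplete, corrections and discussion are welcome.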