Machine Learning & Data Science Skills You Need To Get Hired In Fortune 500 Companies

We researched the must-have and nice-to-have skills that Fortune 500 companies look for when hiring engineers to work on solutions requiring expertise in Machine Learning, Data Science, Big Data, and related areas. We aim to keep this list updated so that it stays relevant.

[Here’s a report on how much Data Scientists were paid in the US in 2016: Data Scientist Salary Insights 2016]

Here are some:

  1. Proficient in querying and manipulating large data sets for analytical purposes using SQL-like languages (Hive / Impala)
  2. Apache Ecosystem – Hadoop, Hadoop Distributed File System (HDFS), MapReduce/YARN (Yet Another Resource Negotiator), Hive (Data warehouse infrastructure), HBase (Distributed Column-oriented NoSQL Database), Oozie Workflow, Sqoop Data Ingestion, Zookeeper, Pig Scripting, Ambari (Hadoop Cluster Management Platform), Spark (Big Data Processing Engine), Flink (Streaming dataflow / analytics engine), Storm (Real-time data processing), Flume (Log data processing), Avro (Data serialization)
  3. Machine learning techniques such as Neural networks, Hidden Markov Model (HMM), Maximum entropy models and other popular algorithms
  4. Feature engineering and statistical modeling methods such as Conditional Random Field (CRF), HMM, Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), etc.
  5. Statistical methods such as Categorical Data Analysis, Multivariate Analysis, Regression Analysis, Survey Sampling Design, Survival/Reliability analysis, Design of experiments, Analysis of variance.
  6. Building machine learning systems for modern parallel-computing environments (GPU, Multicore Symmetric Multiprocessing (SMP), Distributed Clusters); CUDA kernels
  7. Machine learning frameworks such as Caffe, Theano, Torch, TensorFlow, MXNet, Apache Mahout, Spark MLlib, scikit-learn, scipy, numpy; Amazon Machine Learning
  8. Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Supervised and Unsupervised learning, and optimization techniques
  9. Traditional/Modern statistical techniques, including SVM, Regularization, Boosting, Random Forests, and other Ensemble Methods
  10. Natural language processing (NLP) problems, including predictive typing, input method conversion, tokenization, tagging, language modeling, language identification, sentiment analysis, named entity recognition, lemmatization, summarization
  11. Building solutions for spell corrections, related searches, synonym/acronym expansions, query rewrites, metrics accumulation, spam prevention, ranking, and recommendations
  12. Proficiency in predictive modeling and data mining tools such as SQL, R, SAS, JMP, Python, Watson, and Aster
  13. Experience with reporting/analytics/data visualization tools such as D3.js, Tableau, Qlikview, Datameer, Platfora, ELK, and Cognos
  14. Familiarity with commercial ETL platforms like Informatica, SSIS, Talend, etc.
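
To give a rough feel for the first skill on the list, here is a minimal sketch of an analytical aggregation query. It assumes Python's built-in sqlite3 module as a local stand-in for Hive or Impala, and the page_views table and its rows are hypothetical, made up only for illustration. The GROUP BY pattern is broadly similar in Hive and Impala, though syntax details and performance tuning differ on a real cluster.

```python
import sqlite3

# In-memory database used as a stand-in for a Hive/Impala warehouse (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for this illustration.
conn.execute("CREATE TABLE page_views (user_id INTEGER, country TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [(1, "US", 10), (2, "US", 5), (3, "IN", 7), (4, "IN", 3), (5, "DE", 4)],
)

# Count users and total views per country, largest total first.
query = """
    SELECT country, COUNT(*) AS users, SUM(views) AS total_views
    FROM page_views
    GROUP BY country
    ORDER BY total_views DESC
"""
rows = conn.execute(query).fetchall()

for country, users, total_views in rows:
    print(country, users, total_views)
```

On a real data set the same kind of query would typically run against partitioned tables holding billions of rows, which is where knowledge of the engine's execution model starts to matter.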

