Kiran R, Director, Data Sciences & Advanced Analytics at VMware, conducted a webinar with SpringPeople in which he covered Data Sciences and its career prospects.
In this live online knowledge session, Kiran shared the expertise and in-depth understanding of Data Sciences that he has gained from his extensive experience at major technology companies such as Dell, Amazon, Flipkart and VMware.
Data Sciences: A Perspective
There are many definitions of Data Sciences, and the differences between them are largely semantic, just as with Analytics. Analytics means different things to different people. For some, adding two numbers or building a chart in Excel is Analytics, while for others building a propensity model is Analytics. So definitions are always semantic in nature.
Data Science is about exploring data to handle uncertainty. Where there is certainty, there is no data science. For example, the sales of a particular organization in a completed year have a definite answer, but the revenue of the organization in the upcoming year does not. One can estimate it, but not with a single point estimate, as a point estimate is likely to be wrong. In Data Science, not only the problem but also the outcome is uncertain, which makes Data Science more exciting.
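To make the point about estimates concrete, here is a minimal Python sketch that reports a range for next year's revenue instead of a single number. The function name `revenue_range`, the growth figures and the "mean plus or minus two standard deviations" rule are all assumptions made for illustration; they were not presented in the webinar, and the rule is a rough heuristic, not a formal prediction interval.

```python
import statistics


def revenue_range(past_growth_rates, last_revenue, spread=2.0):
    """Return (low, expected, high) revenue for the upcoming year.

    Uses mean historical growth +/- `spread` sample standard deviations.
    This is a rough heuristic chosen for illustration only.
    """
    mean_growth = statistics.mean(past_growth_rates)
    sd_growth = statistics.stdev(past_growth_rates)
    low = last_revenue * (1 + mean_growth - spread * sd_growth)
    expected = last_revenue * (1 + mean_growth)
    high = last_revenue * (1 + mean_growth + spread * sd_growth)
    return low, expected, high


# Hypothetical yearly growth rates and last year's revenue (in any currency unit).
low, expected, high = revenue_range([0.08, 0.12, 0.10, 0.06], last_revenue=100.0)
print(f"Expected about {expected:.1f}, plausibly between {low:.1f} and {high:.1f}")
```

The single figure `expected` is the point estimate; the pair `low` and `high` expresses how uncertain that figure is.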
Difference between Big Data & Data Sciences
Big Data is another data source for the Data Scientist. Big Data is information that cannot be collected or processed with traditional tools. It fits neither into an enterprise data warehouse nor on a laptop, and storing it in the traditional form becomes uneconomical.
Big Data systems therefore store the information across multiple machines rather than on one. If anybody wants to access a piece of information called ‘Assets’ that is stored on 100 machines, a distributed file system is needed. Hadoop provides such a distributed file system (HDFS) along with MapReduce, which distributes a job across several machines (the map step). Processing happens on those machines, and the results are finally aggregated back (the reduce step).
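The map and aggregate steps described above can be sketched on a single machine in plain Python. This is only an illustration of the idea, not Hadoop's actual API: the chunks stand in for data stored on separate machines, and the names `map_chunk`, `reduce_counts` and `count_assets` are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_chunk(records):
    """Map step: count asset types within one machine's chunk of data."""
    return Counter(records)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def count_assets(chunks):
    # On a real cluster each chunk would be processed on a different machine;
    # here the map step simply runs in a loop.
    partial_counts = [map_chunk(chunk) for chunk in chunks]
    return reduce(reduce_counts, partial_counts, Counter())


# Hypothetical 'Assets' records split across three machines.
chunks = [["laptop", "server"], ["server", "router"], ["laptop", "laptop"]]
totals = count_assets(chunks)
print(dict(totals))
```

Each chunk is counted independently, so the map step can run in parallel; only the small partial counts need to be brought together at the end.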
Data Sciences in B2B vs B2C
In B2C, the data sets are larger than in B2B. A B2B company may have half a million customers, whereas a B2C company has millions of customers. The digital influence you can create in B2C is very high.
In B2B, the buying cycle is very long. The average selling price on Amazon and Flipkart is $40, but this is not the case in B2B, where the average selling price runs to thousands or even hundreds of thousands of dollars.
Career in Data Sciences: Data Scientist & Data Scientist Manager
Some of the most essential skills of a Data Scientist are:
1) Debugging skills: Learn to install packages, including open source packages, work with a new OS or technique, and cultivate a never-give-up attitude.
2) Strong data analysis skills: Pull data, manipulate data, visualize data.
3) Programming skills: Expertise in R, Python, SAS or another language that supports statistical packages. “Nothing ever got done in data sciences without programming.”
4) Mathematical aptitude and understanding of techniques: Math is a must in order to explain the techniques to others in a business setting.
5) Business understanding: The ability to convert a business problem into a data mining problem.
Watch the entire Webinar right here
Questions and Answers
1. Please elaborate on the machine learning techniques.
Machine Learning techniques are primarily of three types: Classification, Prediction and Segmentation. Beyond that, techniques can be grouped by the kind of model they use. Linear techniques try to draw a straight line through the data; linear regression, logistic regression and regularized logistic regression are examples. Other important families are tree-based models and nonlinear techniques.
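As an illustration of "trying to draw a straight line", the following sketch fits a simple linear regression by ordinary least squares in plain Python. The function name `fit_line` and the data points are made up for this example and were not part of the webinar.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data that lies exactly on the line y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```

In practice one would normally use a statistical package for this, but the arithmetic behind the straight line is no more than what is shown here.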
2. I am presently working as a Business Intelligence Professional. How long will it take to become a Data scientist?
There is no fixed amount of time that will turn you into a Data Scientist. It varies from person to person, and depends on whether you are willing to commit 1 hour a day or 8 hours a day. To call yourself a Data Scientist, you need to build at least 7-8 models and learn to overcome the pitfalls along the way.
You can also read How to become a Data Scientist?
3. Can you please brief on Recommendation Engines?
Recommendation Engines recommend products to you based on your browsing history. For example, if you are on a product detail page on Flipkart browsing cameras, the related products section will show you camera accessories or other, bigger cameras. A recommendation engine can be framed as a classification problem: to recommend one object out of many, the engine ranks all the objects and shows you the highest-ranked ones.
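The ranking idea can be sketched with a very simple scoring rule: count how often each other product appears in the same browsing session as the current one, then show the highest-scoring products. This co-occurrence rule, the function name `recommend` and the session data are assumptions for illustration; the webinar did not describe how Flipkart or Amazon actually score products.

```python
from collections import Counter


def recommend(current_product, sessions, top_n=2):
    """Rank other products by how often they share a session with current_product."""
    scores = Counter()
    for session in sessions:
        if current_product in session:
            for product in set(session):
                if product != current_product:
                    scores[product] += 1
    # most_common sorts by score, highest first.
    return [product for product, _ in scores.most_common(top_n)]


# Hypothetical browsing sessions.
sessions = [
    ["camera", "tripod", "lens"],
    ["camera", "lens"],
    ["camera", "lens", "bag"],
    ["phone", "case"],
    ["camera", "tripod"],
]
print(recommend("camera", sessions))
```

Every candidate product receives a score, and the recommendation is simply the top of that ranking.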
4. What is Apriori Pattern Mining?
Almost all Data Mining books refer to the beer and diapers example, where somebody goes to a supermarket and buys beer and diapers together. Apriori data mining relies on three things:
- Support – How many times the objects were bought together.
- Confidence – If one object is bought, how likely it is that the other object was also bought.
- Lift – How much more often the objects are bought together than would be expected if they were bought independently.
Textbooks certainly cover the Apriori algorithm, but its application in industry is very low. E-commerce companies like Amazon or Flipkart do not use Apriori data mining to come up with recommendations. To learn all these data mining techniques, I would suggest picking up a good book such as Machine Learning by Tom Mitchell.
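For readers who want to see the support and confidence measures in numbers, here is a small Python sketch on the beer and diapers example. Support is expressed here as a fraction of all transactions, which is the usual textbook convention; the baskets and the function names `support` and `confidence` are made up for illustration, and the sketch computes the measures only, not the full Apriori candidate-generation algorithm.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for basket in transactions if items <= set(basket))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical supermarket baskets.
baskets = [
    ["beer", "diapers"],
    ["beer", "diapers", "milk"],
    ["beer"],
    ["milk"],
]
print(support(baskets, ["beer", "diapers"]))       # 2 of 4 baskets
print(confidence(baskets, ["beer"], ["diapers"]))  # 2 of the 3 beer baskets
```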
5. For a DBA and passionate Data Mining professional, will Data Science be a good foundation?
Well, it’s the other way around. Data science cannot be the foundation for DBA work; rather, DBA knowledge can be a foundation for Data Science. If you are a database administrator and you know SQL well, including inner joins, left outer joins and right outer joins, and you can write short procedures, then your DBA work can be a good foundation for Data Science.
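As a quick refresher on the joins mentioned above, the sketch below runs an inner join and a left outer join against an in-memory SQLite database using Python's standard `sqlite3` module. The `customers` and `orders` tables and their contents are hypothetical, invented only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meena');
INSERT INTO orders VALUES (10, 1, 40.0), (11, 1, 25.0), (12, 2, 60.0);
""")

# Inner join: only customers that have at least one order appear.
inner_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

# Left outer join: every customer appears, even those with no orders.
left_rows = conn.execute("""
    SELECT c.name, COUNT(o.id) AS order_count
    FROM customers c
    LEFT OUTER JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()

print(inner_rows)
print(left_rows)
conn.close()
```

A right outer join is the mirror image of the left outer join: it keeps every row of the right-hand table instead of the left-hand one.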