Top 25 Python Questions & Answers for Data Science Interviews


Over the years, Python has become the most popular and widely adopted programming language for data science applications. Its user-friendly nature and lightweight implementation make it a great choice for data scientists. It is especially convenient when the results of a data analysis need to be integrated into web apps, or when mathematical models and code need to go into production.

So, to succeed in interviews for data science roles, it is important to have a clear idea of the kind of questions to expect. To help you breeze through your interview, I have compiled a list of the Python data science questions you are most likely to face, along with model answers. Do let us know whether it helped by leaving your feedback in the comment section below.

  1. Write the output for the given Python code:

    def multipliers():
        return [lambda x: i * x for i in range(4)]

    print([m(2) for m in multipliers()])

The output for the above code will be [6, 6, 6, 6]. Due to late binding, the value of the variable ‘i’ is looked up when any of the functions returned by multipliers() is called, not when the lambda is created. By then the loop has finished and i holds its final value, 3, so every function returns 3 * 2 = 6.
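If the intended result was one multiplier per value of i, a common workaround is to bind the current value of i as a default argument of the lambda. The sketch below is one way to do that; the name multipliers_fixed is made up for illustration and is not part of the original question.

```python
def multipliers_fixed():
    # "i=i" evaluates i when each lambda is created and stores it as a default,
    # so each function keeps its own value instead of looking it up later
    return [lambda x, i=i: i * x for i in range(4)]

results = [m(2) for m in multipliers_fixed()]
print(results)  # [0, 2, 4, 6]
```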

  2. Can the lambda forms in Python contain statements?

No. A lambda form is syntactically restricted to a single expression, so it cannot contain statements such as assignment statements, return, or loops. It creates an anonymous function object at runtime, and the value of its expression is what the function returns.
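As a rough illustration of the expression-only rule, the sketch below builds a lambda around a conditional expression (which is allowed) and uses the built-in compile() to show that a lambda containing a return statement is rejected as a syntax error. The helper name lambda_compiles is hypothetical.

```python
# A conditional *expression* is fine inside a lambda
sign = lambda n: "negative" if n < 0 else "non-negative"

def lambda_compiles(source):
    """Return True if the source text compiles as an expression, False on SyntaxError."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

print(sign(-5))                               # negative
print(lambda_compiles("lambda n: n + 1"))     # True
print(lambda_compiles("lambda n: return n"))  # False, return is a statement
```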

  3. Which library is used for plotting in Python?

The answer to this question varies based on the requirements for plotting data.

Matplotlib is the standard plotting library in Python, but it needs a lot of fine-tuning before the plots look polished. Many data scientists prefer Seaborn, which is built on top of Matplotlib, for creating statistical plots that are also visually appealing.

  4. How can you check if a dataset or time series is random?

We can use a lag plot, which plots each value against the value that precedes it by a fixed lag. If the lag plot for the given dataset does not show any structure, then the data is random.
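A lag plot is a visual check, but the same idea can be sketched numerically. The example below is a rough, standard-library-only stand-in rather than a real plot: it builds the (y[t], y[t+1]) pairs that a lag plot would draw and measures their correlation. The function names and the two test series are assumptions made for illustration.

```python
import random

def lag_pairs(series, lag=1):
    """Return the (y[t], y[t + lag]) points that a lag plot would draw."""
    return list(zip(series[:-lag], series[lag:]))

def lag_correlation(series, lag=1):
    """Pearson correlation between the series and its lagged copy."""
    pairs = lag_pairs(series, lag)
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)                        # fixed seed so the sketch is repeatable
noise = [rng.random() for _ in range(2000)]   # no structure expected
trend = [0.01 * t for t in range(2000)]       # strongly structured

noise_corr = lag_correlation(noise)  # close to 0: points scatter without a pattern
trend_corr = lag_correlation(trend)  # close to 1: points fall on a line
```

A correlation near zero is only a hint of randomness at that one lag; the plot itself can reveal non-linear structure that this number would miss.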


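  5. What is pylab?

PyLab is a module that ships with Matplotlib and pulls NumPy and Matplotlib's plotting functions (matplotlib.pyplot) into a single namespace, giving a MATLAB-like working environment. It does not import all of SciPy, and the Matplotlib documentation now discourages its use in favor of explicit imports.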

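  6. State the difference between tuples and lists in Python.

Lists are mutable, whereas tuples are immutable, which means they cannot be changed after creation. Because a tuple is immutable (and hashable, provided its elements are hashable), it can be used as a key in a dictionary, for example to store notes on locations keyed by coordinates. A list cannot be used as a key, but it is a good fit for storing a collection of locations that may change.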
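The location example can be sketched as follows. The coordinates and notes are made-up values, and can_be_key is a hypothetical helper that simply tries to use an object as a dictionary key.

```python
# Tuples are hashable, so (x, y) coordinate pairs can serve as dictionary keys
notes = {(10, 20): "warehouse", (35, 5): "head office"}

# Lists are mutable, so they suit a collection of locations that grows or changes
locations = [(10, 20)]
locations.append((35, 5))

def can_be_key(obj):
    """Return True if obj can be used as a dictionary key."""
    try:
        {obj: None}
        return True
    except TypeError:
        return False

print(notes[(10, 20)])     # warehouse
print(can_be_key((1, 2)))  # True
print(can_be_key([1, 2]))  # False, lists are unhashable
```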

  7. Name a few libraries in Python used for Data Analysis and Scientific Computations.

NumPy, SciPy, Pandas, Matplotlib, Seaborn, and scikit-learn are a few libraries in Python used for Data Analysis and Scientific Computations.

  8. Write the code to sort an array in NumPy by the (n-1)th column.

This can be achieved using the argsort() function. For an array x, the (n-1)th column has index n-2, so the code to sort the rows by that column is x[x[:, n-2].argsort()].
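Here is a small sketch of the argsort approach on a made-up 3x3 array, assuming NumPy is installed. With n = 3, the (n-1)th column is the second column, which has index 1.

```python
import numpy as np

x = np.array([[3, 9, 1],
              [1, 5, 7],
              [2, 0, 4]])
n = 3  # illustrative value: sort by the 2nd column (index n - 2 = 1)

# argsort() returns the row order that sorts the chosen column;
# indexing x with that order rearranges whole rows
sorted_x = x[x[:, n - 2].argsort()]
print(sorted_x.tolist())  # [[2, 0, 4], [1, 5, 7], [3, 9, 1]]
```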

  9. If you are given the first and last names of employees, which data type in Python will you use to store them?

You can use a list in which each element holds one employee's first name and last name (for example, as a tuple), or use a dictionary.
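Both options might look like the sketch below. The names and the employee IDs used as dictionary keys are invented for illustration.

```python
# Option 1: a list where each element holds one employee's first and last name
employees = [("Jane", "Doe"), ("John", "Smith")]

# Option 2: a dictionary, here keyed by a hypothetical employee ID
employees_by_id = {
    101: {"first": "Jane", "last": "Doe"},
    102: {"first": "John", "last": "Smith"},
}

full_names = [first + " " + last for first, last in employees]
print(full_names)                    # ['Jane Doe', 'John Smith']
print(employees_by_id[102]["last"])  # Smith
```

The list keeps insertion order and allows duplicates, while the dictionary gives direct lookup by key.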

  10. Explain the usage of decorators.

A decorator is a function that takes another function and extends its behavior without explicitly modifying it. Decorators can be applied to both functions and classes. With the help of a decorator, a piece of code can be executed before or after the original code runs.
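A minimal sketch of the before/after behavior is shown below. The decorator name logged and the call_log list are assumptions chosen for the example.

```python
import functools

call_log = []

def logged(func):
    """Decorator that records a message before and after each call to func."""
    @functools.wraps(func)  # keep the original function's name and docstring
    def wrapper(*args, **kwargs):
        call_log.append("before " + func.__name__)
        result = func(*args, **kwargs)
        call_log.append("after " + func.__name__)
        return result
    return wrapper

@logged
def add(a, b):
    return a + b

total = add(2, 3)
print(total)     # 5
print(call_log)  # ['before add', 'after add']
```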

  11. What will be the output of the below code?

    def foo(i=[]):
        i.append(1)
        return i

    >>> foo()
    >>> foo()

The output for the above code will be:

    [1]
    [1, 1]

The default argument of the function foo is evaluated only once, when the function is defined, so every call that omits the argument shares the same list. Each call appends another 1 to that one list.
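When a fresh list is wanted on every call, the usual idiom is to default to None and create the list inside the function. The sketch below uses the made-up name foo_fixed to contrast with the question's foo.

```python
def foo_fixed(i=None):
    # None is immutable, so a new list is created on each call that omits i
    if i is None:
        i = []
    i.append(1)
    return i

first = foo_fixed()
second = foo_fixed()
print(first)   # [1]
print(second)  # [1]
```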

  12. Which tool in Python would you use to find bugs?

Two tools for finding bugs in Python are Pylint and PyChecker. Pylint verifies whether a module satisfies the coding standards. PyChecker is a static analysis tool that helps find bugs in the source code.

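  13. What is the difference between range () and xrange () functions in Python?

In Python 2, the range() function returns a list, whereas the xrange() function returns an object that generates the numbers on demand, much like an iterator, so it uses far less memory for large ranges. In Python 3, xrange() no longer exists and range() itself behaves lazily.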
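The memory difference can be sketched in Python 3, where range() plays the role that xrange() had in Python 2. The sizes reported by sys.getsizeof depend on the interpreter, so the sketch only compares them rather than quoting numbers.

```python
import sys

# Python 3: range() is lazy, like Python 2's xrange()
lazy_numbers = range(1_000_000)
small_list = list(range(1_000))

lazy_size = sys.getsizeof(lazy_numbers)  # size of the range object itself
list_size = sys.getsizeof(small_list)    # size of a list holding only 1,000 items

# A range still supports len() and indexing without building a list
length = len(lazy_numbers)
item = lazy_numbers[10]

print(lazy_size < list_size)  # True: a million-element range is smaller than a 1,000-item list
```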

  14. How can you randomize the items of a list in place in Python?

We can use the shuffle(lst) function from the random module, which shuffles the list in place and returns None.
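A short sketch of in-place shuffling follows. The seed is set only to make the run repeatable and is not required.

```python
import random

lst = [1, 2, 3, 4, 5]
original = list(lst)  # keep a copy for comparison

random.seed(42)               # optional: makes the shuffle repeatable
result = random.shuffle(lst)  # reorders lst itself

print(result)                   # None, because the list is modified in place
print(sorted(lst) == original)  # True: same items, possibly in a new order
```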

  15. What is PEP8?

PEP 8 is the style guide for Python code: a set of coding conventions that programmers can follow to write readable code, which makes it easier for others to use and maintain.

  16. What is monkey patching in Python?

Monkey patching is a technique that lets programmers modify or extend existing code, such as a class or module, at runtime. It comes in handy while testing, but it is not good practice to use it in a production environment because debugging the code can become difficult.
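As a rough sketch, the example below replaces a method on a class at runtime and then restores it, which is the pattern tests often rely on. The class Greeter and the function loud_greet are invented for illustration.

```python
class Greeter:
    def greet(self):
        return "hello"

def loud_greet(self):
    return "HELLO"

original_greet = Greeter.greet  # keep a reference so the patch can be undone

Greeter.greet = loud_greet      # monkey patch: swap the method at runtime
patched = Greeter().greet()

Greeter.greet = original_greet  # restore the original behavior
restored = Greeter().greet()

print(patched)   # HELLO
print(restored)  # hello
```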

  17. What do you mean by list comprehension?

A list comprehension is a concise syntax for building a new list by applying an expression to each item of an iterable, optionally keeping only the items that satisfy a condition.
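The sketch below shows a comprehension with and without a condition, next to the equivalent explicit loop. The variable names are arbitrary.

```python
# Apply an expression to every item
squares = [n * n for n in range(5)]

# Apply an expression only to items that pass a condition
even_squares = [n * n for n in range(10) if n % 2 == 0]

# The same result as even_squares, written as an explicit loop
loop_result = []
for n in range(10):
    if n % 2 == 0:
        loop_result.append(n * n)

print(squares)       # [0, 1, 4, 9, 16]
print(even_squares)  # [0, 4, 16, 36, 64]
```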

  18. Suppose the data is stored in HDF5 format and you want to find how the data is structured. Which command would you use to find out the names of the HDF5 keys?

In this case, we could use the following command:

    hf.keys()

Note: The HDF5 file has been loaded by h5py as hf.

  19. What will be the output of the below code?

    word = 'aeioubcdfg'
    print(word[:3] + word[3:])

The output for the above code will be: aeioubcdfg.

The slice word[:3] contains everything before index 3 and word[3:] contains everything from index 3 onward. Since the two slices meet at the same index, the “+” operator concatenates them back into the original string.
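The same property holds for any split point, including negative indices and indices past the end of the string. The sketch below checks it over a range of values of k; the range itself is an arbitrary choice.

```python
word = "aeioubcdfg"

head = word[:3]  # 'aei'
tail = word[3:]  # 'oubcdfg'

# Splitting at any index k and concatenating the two parts rebuilds the string
rebuilt = [word[:k] + word[k:] for k in range(-3, len(word) + 5)]
all_equal = all(piece == word for piece in rebuilt)

print(head + tail)  # aeioubcdfg
print(all_equal)    # True
```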

  20. How can you check whether a pandas DataFrame is empty or not?

The attribute df.empty is used for this check. It is True when the DataFrame has no items (that is, when any of its axes has length 0) and False otherwise.

  21. Which Python library is used for Machine Learning?

scikit-learn is the Python library that can be used for Machine Learning.

  22. You are given a list of N numbers. Write a single list comprehension in Python that creates a new list containing only the even values found at even indices of the original list. For instance, if list[4] holds an even value, it has to be included in the new list because its index is even; but if list[5] holds an even value, it should not be included because its index is odd.

    [x for x in list[::2] if x % 2 == 0]

The slice list[::2] takes the elements at even indices (0, 2, 4, ...), and the condition then discards the odd values.
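A worked example on made-up data may help. Even indices start at 0, so the slice has to start at 0 with a step of 2. The name lst is used here instead of list to avoid shadowing the built-in, and the enumerate version is an equivalent alternative.

```python
lst = [10, 4, 3, 8, 6, 2, 7, 12]
# index:  0  1  2  3  4  5  6   7

# Slice version: take indices 0, 2, 4, 6 and keep the even values
even_at_even = [x for x in lst[::2] if x % 2 == 0]

# Equivalent version that checks the index explicitly
with_enumerate = [x for i, x in enumerate(lst) if i % 2 == 0 and x % 2 == 0]

print(even_at_even)  # [10, 6]
```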

  23. What will be the output of the given code?

    list = ['a', 'e', 'i', 'o', 'u']
    print(list[8:])

The output for the above code will be an empty list, []. Many people expect an IndexError here, because the code refers to an index that exceeds the number of members in the list. However, an IndexError is raised only when a single out-of-range index is accessed, as in list[8]. Slicing from a starting index beyond the end of the list simply returns an empty list.
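The contrast between slicing and indexing can be sketched like this, using the name vowels rather than list to avoid shadowing the built-in.

```python
vowels = ["a", "e", "i", "o", "u"]

empty_slice = vowels[8:]  # slicing past the end is allowed and gives []

try:
    vowels[8]             # indexing past the end is not allowed
    raised = False
except IndexError:
    raised = True

print(empty_slice)  # []
print(raised)       # True
```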

  24. How would you import a decision tree classifier in sklearn?

from sklearn.tree import DecisionTreeClassifier

  25. You want to read a website which has URL as “”. How would you perform this task?

urllib2.urlopen( OR



I hope these questions & model answers will help you in your interview prep. Do let me know how your interview goes in the comments below!



About Sourav Gorai

A research analyst focusing on emerging technologies in IT, Sourav regularly covers the latest developments & industry trends with a focus on data science, AI/machine learning, cloud computing and allied domains.


