Saturday, 11 November 2017

Top 100+ Machine Learning Interview Questions Answers PDF

What is the definition of learning from experience for a computer program?
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.



Explain what is Machine learning?
Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.
OR
The acquisition of knowledge or skills through study or experience by a machine.
OR
The ability for machines to learn without being explicitly programmed.

What are different types of learning?
Supervised learning
Unsupervised learning
Semisupervised learning
Reinforcement learning
Transduction
Learning to Learn

What are the differences between artificial intelligence, machine learning and deep learning?
Artificial Intelligence: As the name implies, it means producing intelligence by artificial means, in other words, using computers.


Machine Learning: This is a sub-topic of AI. As learning is one of the many functionalities of an intelligent system, machine learning is one of the many functionalities in an AI.
Deep learning: Deep learning is the specific sub-field in machine learning involving making very large and deep (i.e. many layers of neurons) neural networks to solve specific problems. It is the current “model of choice” for many machine learning applications.


What are some popular algorithms of Machine Learning?
Decision Trees
Neural Networks (back propagation)
Probabilistic networks
Nearest Neighbor
Support vector machines(SVM)

Explain unsupervised learning?
In unsupervised learning we only have the input values x_i, but no explicit target labels.

Explain classification?
In machine learning, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.

What is reinforcement Learning?
The goal is to develop a system (agent) that improves its performance based on interactions with the environment. Since the information about the current state of the environment typically also includes a so-called reward signal, we can think of reinforcement learning as a field related to supervised learning. However, in reinforcement learning this feedback is not the correct ground truth label or value, but a measure of how well the action was measured by a reward function. Through the interaction with the environment, an agent can then use reinforcement learning to learn a series of actions that maximizes this reward via an exploratory trial-and-error approach or deliberative planning. A popular example of reinforcement learning is a chess engine.

Describe the relationship between all types of machine learning and particularly the application of unsupervised.
In supervised learning, we know the right answer beforehand when we train our model, and in reinforcement learning, we define a measure of reward for particular actions by the agent. In unsupervised learning, however, we are dealing with unlabeled data or data of unknown structure. Using unsupervised learning techniques, we are able to explore the structure of our data to extract meaningful information without the guidance of a known outcome variable or reward function.

What is one of the key ingredients of supervised machine learning algorithms?
One key ingredient is defining an objective function that is to be optimized during the learning process. This objective function is often a cost function that we want to minimize.

What is a support vector machine?
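A support vector machine is a classifier that chooses the decision boundary with the largest margin. In other words, it maximizes the minimum distance between the boundary and the closest training samples (the support vectors).
e.g.
The dish is good by itself, but to enhance it you add as much salt as you can without making it taste too salty.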

What do we call a learning problem, if the target variable is continuous?
When the target variable that we're trying to predict is continuous, the learning problem is also called a regression problem.

What do we call a learning problem, if the target variable can take on only a small number of values?
When y can take on only a small number of discrete values, the learning problem is also called a classification problem.

How do we measure the accuracy of a hypothesis function?
We measure the accuracy by using a cost function, usually denoted by J.

Describe variance and bias in what they measure?
Variance measures the consistency (or variability) of the model prediction for a particular sample instance if we were to retrain the model multiple times, for example, on different subsets of the training dataset. A model with high variance is sensitive to the randomness in the training data. In contrast, bias measures how far off the predictions are from the correct values in general if we rebuild the model multiple times on different training datasets. Bias is the measure of the systematic error that is not due to randomness.

Describe the benefits of regularization?
One way of finding a good bias-variance tradeoff is to tune the complexity of the model via regularization. Regularization is a very useful method to handle collinearity (high correlation among features), filter out noise from data, and eventually prevent overfitting. The concept behind regularization is to introduce additional information (bias) to penalize extreme parameter weights.

What is the definition of a cost function of a supervised learning problem?
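The cost function measures the difference between the hypothesis outputs for the inputs x and the actual outputs y, aggregated (summed or averaged) over all training examples. For simple linear regression with squared error:
$$J(\beta_0, \beta_1) = \sum_{(x_i, y_i) \in X \times Y} \left(y_i - \hat{y}(x_i)\right)^2 = \sum_{(x_i, y_i) \in X \times Y} \left(y_i - (\beta_0 + \beta_1 x_i)\right)^2$$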
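
As a minimal sketch of the sum-of-squared-errors cost for a straight-line hypothesis, the function below evaluates J for given coefficients. The function name and the toy data are assumptions made for illustration only.

```python
def sse_cost(beta0, beta1, xs, ys):
    """Sum of squared errors for the line y_hat = beta0 + beta1 * x."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))


# Toy data (hypothetical): the points lie exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

perfect = sse_cost(0.0, 2.0, xs, ys)  # the line passes through every point
worse = sse_cost(0.0, 1.0, xs, ys)    # residuals are 1, 2 and 3
```

A lower value of J means the line fits the training data more closely, which is why training searches for the coefficients that minimize it.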

Explain Random forest?
Random forest is a collection of trees, hence the name 'forest'! Each tree is built from a sample of the data. The output of a RF is the mode of the classes (for classification) or the mean prediction (for regression) of the individual trees.
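
The way a forest combines its trees can be sketched as follows. This only illustrates the aggregation step (majority vote for classification, mean for regression), not how the trees are built; the function names and sample predictions are hypothetical.

```python
from collections import Counter
from statistics import mean


def aggregate_classification(tree_votes):
    """Return the most common class label among the trees' predictions."""
    return Counter(tree_votes).most_common(1)[0][0]


def aggregate_regression(tree_outputs):
    """Return the mean of the trees' numeric predictions."""
    return mean(tree_outputs)


# Hypothetical predictions from five classification trees
# and three regression trees.
votes = ["cat", "dog", "cat", "cat", "dog"]
outputs = [2.0, 3.0, 4.0]

label = aggregate_classification(votes)
value = aggregate_regression(outputs)
```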

What are training algorithms in Machine learning?
Given a model h with solution space S and a training set {X, Y}, a learning (training) algorithm finds the solution in S that minimizes the cost function J.

Explain Local minima?
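A local minimum is a point where the function's value is smaller than at all nearby points. It is not necessarily the smallest value of the function overall (the global minimum), and a function can have more than one local minimum.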

What is multivariate linear regression?
Linear regression with multiple variables.

How can we speed up gradient descent?
We can speed up gradient descent by having each of our input values in roughly the same range.

What is the decision boundary given a logistic function?
The decision boundary is the line that separates the area where y = 0 and where y = 1. It is created by our hypothesis function.

What is underfitting?
Underfitting, or high bias, is when the form of our hypothesis function h maps poorly to the trend of the data.

What usually causes underfitting?
It is usually caused by a function that is too simple or uses too few features.

What is overfitting?
Overfitting, or high variance, is caused by a hypothesis function that fits the available data but does not generalize well to predict new data.

What usually causes overfitting?
It is usually caused by a complicated function that creates a lot of unnecessary curves and angles unrelated to the data.

How can we avoid overfitting?
For a decision tree: stop growing the tree when a data split is no longer statistically significant, OR grow the full tree and then post-prune it.

What is features scaling?
Feature scaling is a method used to standardize the range of independent variables or features of data. In data processing, it is also known as data normalization and is generally performed during the data preprocessing step.

What are the advantages of data normalization?
A few advantages of normalizing the data are as follows:
1. It makes your training faster.
2. It helps prevent you from getting stuck in local optima.
3. It gives you a better error surface shape.
4. Weight decay and Bayesian optimization can be done more conveniently.

What range is used for feature scaling?
0-1

What is the formula for feature scaling?
(x-xmin)/(xmax-xmin)
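
The formula can be sketched in a few lines. The function name is hypothetical, and mapping a constant column to 0.0 is an assumption made here to avoid dividing by zero.

```python
def min_max_scale(values):
    """Rescale values to the range 0-1 using (x - xmin) / (xmax - xmin)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 to avoid
        # division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


scaled = min_max_scale([10, 20, 40])
```

The smallest value always maps to 0 and the largest to 1, so features measured in very different units end up on a comparable scale.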

What two algorithms does features scaling help with?
K-means and SVM with an RBF kernel

What two algorithms does features scaling NOT help with?
Linear regression and decision trees

What is a good clustering algorithm?
K-means

In a basic sense, what are neurons?
Neurons are basically computational units that take electrical inputs (called "spikes") through their dendrites and channel them to outputs through their axons.

What is a neural network?
Takes an input layer -> hidden layer of logistic regression -> outputs of the hidden layer are binary that go to the output layer
e.g.
is that a house cat? input layer is whiskers, fur, paws, large. Hidden layer finds that cats are small (so the output of the hidden layer is 1, 1, 1 ,0). Because not all features (outputs) from the hidden layers are true, it's not a house cat.


What is Regression Analysis?
We are given a number of predictor (explanatory) variables and a continuous response variable (outcome), and we try to find a relationship between those variables that allows us to predict an outcome.

What are the dendrites in the model of neural networks?
In our model, our dendrites are like the input features.

What are the axons in the model of neural networks?
In our model, the axons are the results of our hypothesis function.

What is the bias unit of a neural network?
The input node x0 is sometimes called the "bias unit." It is always equal to 1.

What are the weights of a neural network?
Using the logistic function, our "theta" parameters are sometimes called "weights".

What is the activation function of a neural network?
The logistic function (as in classification) is also called a sigmoid (logistic) activation function.
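
A minimal sketch of one activation unit, assuming the convention described above: the first input is the bias unit fixed at 1, and the weights play the role of the "theta" parameters. The function names are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic (sigmoid) function; maps any real z into (0, 1)."""
    # Note: math.exp can overflow for very large negative z;
    # that case is ignored in this sketch.
    return 1.0 / (1.0 + math.exp(-z))


def activation(weights, inputs):
    """Weighted sum of the inputs passed through the sigmoid.

    inputs[0] is assumed to be the bias unit and equal to 1.
    """
    z = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(z)


# Bias unit x0 = 1 followed by two features; the weighted sum is 0 here.
a = activation([0.0, 1.0, -1.0], [1.0, 2.0, 2.0])
```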

How do we label the hidden layers of a neural network?
We label these intermediate or hidden layer nodes. The nodes are also called activation units.

What is the kernel method?
When you can't use logistic regression because there isn't a clear linear delineation between the two groups, you need to draw a curved line. For example, multiplying x and y gives a third feature, so the groups can be separated by a plane in 3D space.
e.g.
Monkey in the middle. If there are two people on either side of the person in the middle, how do you draw a straight line to separate the two groups (e.g. with logistic regression)? You can't. You have to draw a curved line.

What's the motivation for the kernel trick?
To solve a nonlinear problem using an SVM, we transform the training data onto a higher-dimensional feature space via a mapping function and train a linear SVM model to classify the data in this new feature space. Then we can use the same mapping function to transform new, unseen data and classify it using the linear SVM model.
However, one problem with this mapping approach is that the construction of the new features is computationally very expensive, especially if we are dealing with high-dimensional data. This is where the so-called kernel trick comes into play: a kernel function computes the dot product between two samples in the higher-dimensional space directly from the original features, so the new features never have to be constructed explicitly.
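
One way to see why this saves work is to compare an explicit mapping with the matching kernel. The sketch below assumes a degree-2 polynomial kernel on 2D inputs, chosen only because its explicit mapping is small enough to write out; the function names are hypothetical.

```python
import math


def phi(x):
    """Explicit mapping of a 2D point into a 3D feature space."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def poly_kernel(x, z):
    """Degree-2 polynomial kernel: (x . z)^2, computed in the original space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))   # map first, then take the dot product
via_kernel = poly_kernel(x, z)   # same value without building phi(x), phi(z)
```

Both routes give the same number (up to floating-point rounding), but the kernel never constructs the higher-dimensional vectors.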

Give the setup of using a neural network.
• Pick a network architecture.
• Choose the layout of your neural network.
• Number of input units; dimension of features x^(i).
• Number of output units; number of classes.
• Number of hidden units per layer; usually more the better.

How does one train a neural network?
1. Randomly initialize the weights.
2. Implement forward propagation.
3. Implement the cost function.
4. Implement backpropagation.
5. Use gradient checking to confirm that your backpropagation works.
6. Use gradient descent to minimize the cost function with the weights in theta.

How can we break down our decision process deciding what to do next? 
• Getting more training examples: Fixes high variance.
• Trying smaller sets of features: Fixes high variance.
• Adding features: Fixes high bias.
• Adding polynomial features: Fixes high bias.
• Decreasing lambda: Fixes high bias.
• Increasing lambda: Fixes high variance.

What issue does a neural network with fewer parameters pose?
A neural network with fewer parameters is prone to underfitting.

What issue does a neural network with more parameters pose?
A large neural network with more parameters is prone to overfitting.

What is the relationship between the degree of the polynomial d and the underfitting or overfitting of our hypothesis?
A low degree d tends to underfit and a high degree d tends to overfit:
• High bias (underfitting): both J_train(Θ) and J_CV(Θ) will be high. Also, J_CV(Θ) is approximately equal to J_train(Θ).
• High variance (overfitting): J_train(Θ) will be low and J_CV(Θ) will be much greater than J_train(Θ).

Describe Logistic Regression vs SVM.
In practical classification tasks, linear logistic regression and linear SVMs often yield very similar results. Logistic regression tries to maximize the conditional likelihoods of the training data, which makes it more prone to outliers than SVMs. The SVMs mostly care about the points that are closest to the decision boundary (support vectors). On the other hand, logistic regression has the advantage that it is a simpler model that can be implemented more easily. Furthermore, logistic regression models can be easily updated, which is attractive when working with streaming data.

Give an overview of the decision tree process.
We start at the tree root and split the data on the feature that results in the largest information gain (IG). In an iterative process, we can then repeat this splitting procedure at each child node until the leaves are pure. This means that the samples at each node all belong to the same class.
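
A minimal sketch of how a candidate split could be scored. It assumes entropy as the impurity measure (Gini impurity is a common alternative); the function names and labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["a", "a", "b", "b"]
# A split that separates the classes perfectly versus one that does nothing.
pure_split = information_gain(parent, [["a", "a"], ["b", "b"]])
useless_split = information_gain(parent, [["a", "b"], ["a", "b"]])
```

The tree-building procedure would evaluate each candidate feature this way and split on the one with the largest gain.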

Describe parametric vs nonparametric models?
Machine learning algorithms can be grouped into parametric and nonparametric models. Using parametric models, we estimate parameters from the training dataset to learn a function that can classify new data points without requiring the original training dataset anymore. Typical examples of parametric models are the perceptron, logistic regression, and the linear SVM. In contrast, nonparametric models can't be characterized by a fixed set of parameters, and the number of parameters grows with the training data. Two examples of nonparametric models that we have seen so far are the decision tree classifier/random forest and the kernel SVM.

What is feature extraction?
A method to transform or project the data onto a new feature space. In the context of dimensionality reduction, feature extraction can be understood as an approach to data compression with the goal of maintaining most of the relevant information.

Explain PCA in a nutshell.
It aims to find the directions of maximum variance in high-dimensional data and projects the data onto a new subspace with equal or fewer dimensions than the original one. The orthogonal axes (principal components) of the new subspace can be interpreted as the directions of maximum variance given the constraint that the new feature axes are orthogonal to each other.

What is Exploratory Data Analysis?
(EDA) is an important and recommended first step prior to the training of a machine learning model. For example, it may help us to visually detect the presence of outliers, the distribution of the data, and the relationships between features.

What is word stemming?
The process of transforming a word into its root form that allows us to map related words to the same stem

What is OLS?
Ordinary Least Squares (OLS) method is to estimate the parameters of the regression line that minimizes the sum of the squared vertical distances (residuals or errors) to the sample points.
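
For a single explanatory variable, the OLS estimates have a closed form. The sketch below uses it; the function name and data are hypothetical, and it assumes the x values are not all identical.

```python
def ols_fit(xs, ys):
    """Closed-form OLS estimates (intercept, slope) for y = b0 + b1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx  # assumes the xs are not all identical
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Toy data lying exactly on y = 1 + 2x.
intercept, slope = ols_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```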

What are residual plots?
Since our model uses multiple explanatory variables, we can't visualize the linear regression line (or hyperplane to be precise) in a two-dimensional plot, but we can plot the residuals (the differences or vertical distances between the actual and predicted values) versus the predicted values to diagnose our regression model. Those residual plots are a commonly used graphical analysis for diagnosing regression models to detect non-linearity and outliers, and to check if the errors are randomly distributed.

What is the elbow method?
A graphical technique to estimate the optimal number of clusters k for a given task. Intuitively, we can say that, if k increases, the distortion (within-cluster SSE) will decrease. This is because the samples will be closer to the centroids they are assigned to. The idea behind the elbow method is to identify the value of k at which the distortion stops decreasing rapidly (the "elbow" of the curve); adding more clusters beyond this point yields only small improvements.

The two main approaches to hierarchical clustering are?
agglomerative and divisive hierarchical clustering

What is Deep learning?
It can be understood as a set of algorithms that were developed to train artificial neural networks with many layers most efficiently.

What does the feedforward in feedforward artificial neural network mean?
Feedforward refers to the fact that each layer serves as the input to the next layer without loops, in contrast to recurrent neural networks for example.

What is gradient checking?
It is essentially a comparison between our analytical gradients in the network and numerical gradients, where a numerically approximated gradient =( J(w + epsilon) - J(w) ) / epsilon, for example.
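
A minimal sketch of that comparison on a toy one-parameter cost J(w) = w^2, whose analytical gradient is 2w. The cost function, names and epsilon value are assumptions made for illustration; a real network would check every weight this way.

```python
def cost(w):
    """Toy cost function J(w) = w^2 (stands in for a network's cost)."""
    return w ** 2


def analytic_grad(w):
    """Gradient of the toy cost derived by hand: dJ/dw = 2w."""
    return 2 * w


def numerical_grad(J, w, epsilon=1e-6):
    """Forward-difference approximation: (J(w + eps) - J(w)) / eps."""
    return (J(w + epsilon) - J(w)) / epsilon


w = 3.0
diff = abs(analytic_grad(w) - numerical_grad(cost, w))
```

If the difference is tiny, the analytical gradient (and hence the backpropagation code that produced it) is probably correct; a large difference points to a bug.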

What are Recurrent Neural Networks?
Recurrent Neural Networks (RNNs) can be thought of as feedforward neural networks with feedback loops or backpropagation through time. In RNNs, the neurons only fire for a limited amount of time before they are (temporarily) deactivated. In turn, these neurons activate other neurons that fire at a later point in time. Basically, we can think of recurrent neural networks as MLPs with an additional time variable. The time component and dynamic structure allows the network to use not only the current inputs but also the inputs that it encountered earlier.


Thursday, 14 September 2017

Tuesday, 12 September 2017

Top 12 Things Not to Say During an Interview : Interview tips

Bio:
Susan Ranford is an expert on job market trends, hiring, and business management. She is the Community Outreach Coordinator for New York Jobs. In her blogging and writing, she seeks to shed light on issues related to employment, business, and finance to help others understand different industries and find the right job fit for them.

Things Not to Say During an Interview
Interviews are places where you have to watch your tongue every second. You don’t want to say too much, the wrong thing, or ramble on incessantly. Some topics should be totally off limits during an interview. Be sure to keep these things on your mind and off your tongue at your next interview.

  • Never admit to being nervous.
Being nervous and admitting it are two different things during an interview.

You should show confidence in yourself first. Hide the case of nerves as best you can, and do not mention being nervous. The interviewer is looking for a confident candidate, and that can be you.

It may seem endearing to admit it, like you’re nervous and excited for the opportunity. But ultimately, it’s better to appear confident and in control of your emotions.

  • Never mention entrepreneurial aspirations during an interview.
Don’t tell an interviewer that you want to be your own boss.

Mentioning that you want to be your own boss puts you in a unique category that you really don’t want to be in. According to Ken Sundheim, it immediately lists you as a threat to the company because you could be there to learn trade secrets or be seen as a potential loss that leads to another day of interviewing. If you want to be hired, don’t tell them you want to work for yourself. Explain why you want to work for them.

  • Don’t be too eager to work.
If you are looking to be hired, being too eager to work can work against you.

Instead of being available for any job, be job specific when you apply. If you are willing to do anything, the interviewer may see you as desperate and not having specialized skills. If you’ll do anything, you’re not necessarily good at something. Make yourself valuable to the company by expressing yourself and your talents well.

  • Don’t Give Apologies for Lack of Experience.
If your resume doesn’t show years of experience in the industry that you are trying to break into, don’t make apologies for this lack of experience.

Build on your strengths rather than dwelling on your weak areas. This advice applies to the mid-career changer as well as the new graduate. If you don’t have the years of experience that the interviewer is requiring, mention any skills that transfer and make you the best qualified person for the job.

  • Don’t tell them to look at your resume.
If you are asked a question, be sure to answer the question.

Don’t refer the interviewer back to your resume. They want to hear an answer directly from you. You are a living, breathing person. Your resume is a piece of paper.

Reminding them that you listed it on your resume is disrespectful, and it is a definite don’t.

  • Don’t talk about your job search.
Let them know how you found them, but don’t talk about the hours you spent looking for other opportunities on local job sites. They don’t care about your job search; they care about potentially hiring you.

Move the focus to you and your skills as much as possible, not your inability to find other opportunities.

  • Don’t wait for questions you want to answer.
Rehearsing and practicing for an interview gives you confidence for answering certain questions, but don’t be listening and anticipating just those questions.

You have to be attentive and able to give answers to all of the questions that are asked. Being able to carry on a conversation without appearing to have it scripted is important to making it past round one in the interview process.

  • Don’t use clichés.
Be original with your words. Always find new and positive ways of saying things. Instead of using buzzwords and clichés, describe yourself with adjectives and phrases that showcase your creativity and ability to think independently.

The interview is the place to set yourself apart from the crowd, and your conversation is the most obvious way to do this.

  • Definitely lose the filler words.
Any word or phrase that you use to fill in a sentence while you are thinking should not be spoken during an interview. These words include ‘like’, and sounds such as ‘um’, and ‘er’. These do not help you communicate your message clearly and succinctly. You should remain silent rather than filling the gaps with these sounds.

Filler words are often words you don’t even remember saying. A pro tip is to record yourself speaking. Then, watch to see which filler words you need to eliminate from your vocabulary.

  • Stay on topic.
If you have a tendency to ramble, put a lid on it during the interview. Feel free to be a great storyteller around your friends and family, but not during an interview unless the story you are telling is relevant to the job at hand.

If in your work experiences you achieved great success, then definitely share that story when the time is right. However, if you just can’t wait to tell someone about your weekend plans, keep that story to yourself.

  • Never ask what the company does.
If you have to ask what the company does that you are interviewing with, you haven’t done your homework. This question is the biggest turnoff for recruiters and can summarily end what was a great interview up until you asked this fatal question. If you ask this question, odds are really good that you won’t be hired.

  • Realize the power of your words.
Words are powerful, especially during an interview. Choose them carefully, and you’ll increase the odds of landing the job. If you’ve already been to an interview and have said all of the wrong things, learn from your mistakes.

If you slip up and say one of these things, realize it. Next time you’ll know what not to say.

Sunday, 10 September 2017

Top 20 Mixpanel Analytics Interview Questions with Answers

Here are popular interview questions about the analytics tool known as Mixpanel. So let’s start.

What do you know about Mixpanel? Explain.
Ans: Mixpanel is a business analytics service and company. It tracks user interactions with web and mobile applications and provides tools for targeted communication with them. Its toolset contains in-app A/B tests and user survey forms. Data collected is used to build custom reports and measure user engagement and retention. Mixpanel works with web applications, in particular SaaS, but also supports mobile apps. As of January 2016 the company had more than 230 employees, but had to fire 18 people due to overhiring.

In which format does Mixpanel store data?
Ans: JSON

Why use Mixpanel when there are many other tools on the market?
Ans: Mixpanel provides a solution that lets businesses track the specific user actions that are important to their business questions, along with detailed information about these actions and users. These metrics can be tracked on any platform, and the customer gets maximum flexibility and granularity in terms of what details they track. Once the data is in, Mixpanel's reports let customers ask very complex, specific questions of this data.

Explain People analytics?
People Analytics helps you understand and re-engage your customers. Imagine being able to understand who your users are, see what they do before or after they sign up, re-engage them with messages, and dig deep into your customer revenue. Now it’s possible, all in one place, with People Analytics.

Explain what is Segmentation?
Ans: Segmentation allows you to view top events on your app and easily break down complex events in Mixpanel. You have the ability to drill down by an unlimited number of properties to gather instant insight into these key actions on your app. You can choose any individual event in Mixpanel, compare multiple events, and see total events, unique users firing these events, and the average number of events per user. Mixpanel also gives you multiple options to view this data.

What are Mixpanel's top features for analysing your people?
Ans:
Drill into your data: With Insights, find out where to focus your resources when building your product.
Visualize your data in different ways: Smooth out noisy results to really understand what’s going on.
Discover insights quickly: When digging into complex questions, don’t get slowed down waiting for an answer.
Bookmarks: Let you save reports that you look at a lot so you can save time.

EXAMPLE QUESTIONS MIXPANEL CAN ANSWER TO HELP YOU MAKE DECISIONS
(to get to know Mixpanel better)
Which sources have driven the most mobile installations over time?

Which feature should I invest in further to drive up customer conversion?

Was the ROI on my latest ad social spend campaign more or less than previous campaigns?

How is Mixpanel different from Google analytics?
Ans: Mixpanel differs from Google Analytics in one major way: instead of tracking page views, it tracks the actions people take in your mobile or web application.

How can I export my people profiles into a CSV?
Ans: People profiles currently cannot be exported via the Mixpanel UI; however, you can easily export your people profiles within Mixpanel using one simple query in JQL.

Explain JQL(JavaScript Query Language)?
Ans: JQL – JavaScript Query Language – uses the full power of a robust and popular programming language, JavaScript, to let you analyze your data in Mixpanel. It was designed for performance and flexibility so that developers and data scientists can pull the most valuable insights from their data with ease ‐ no matter how complex the question is.

What are the advantages of using JavaScript for analytics over SQL? JAVASCRIPT VS. SQL
Advantages of using JavaScript for analytics:

  • The full power of a programming language powered by V8 ‐ the JavaScript engine in Chrome
  • Easily express & compose queries that are more understandable
  • A modern & popular programming language amongst developers to quickly get started
  • Flexible to use with unstructured, schema less data

Disadvantages of using SQL for analytics:

  • Meant for rigid schemas for traditional relational databases
  • Difficult to manipulate and transform the data
  • Complex queries become unwieldy to read & compose
  • Limited flexibility due to query functions available in SQL


How do I track a page view in Mixpanel, for example?

What is distinct_id?
Ans: Mixpanel can keep track of actions in your application right down to the individual customer level. This is done using a property called distinct_id. The property can (and in most cases should) be included with every event you send to Mixpanel to tie it to a user. Distinct_id plays a vital role across most Mixpanel reporting.
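
To make the idea concrete, here is a rough sketch of an event that carries a distinct_id among its properties, serialized as JSON. The event name, the user id, and the extra "plan" property are hypothetical values, and the exact payload Mixpanel expects should be checked against its documentation.

```python
import json

# Hypothetical values for illustration only; the event name, user id and
# the "plan" property are assumptions, not taken from a real project.
event = {
    "event": "Signed Up",
    "properties": {
        "distinct_id": "user-123",  # ties this event to one user
        "plan": "free",
    },
}

payload = json.dumps(event)
```

Every event sent with the same distinct_id can then be attributed to the same user in reports.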

Where can I find my project token?
Ans: Click your name in the upper righthand corner of your Mixpanel project and select Project settings to see your project token for only the project you’re currently viewing.

What data types does Mixpanel accept as Properties?
Ans: String
Numeric
Boolean
Date
List

Explain Activity Feed in Mixpanel?
Make better product decisions by seeing the full story of how individual customers use your product. Activity Feed puts customer behavior into an event-based timeline, so you can follow along as people experience your product, seeing where they get stuck along the way.



Wednesday, 23 August 2017

What is Cached Report in SSRS ?


Caching keeps a copy of the last executed report and stores it in the report server temp DB.

SSRS lets you enable caching for a report and maintain a copy of the processed report in an intermediate format in the report server temp DB, so that if the same report request comes again, the stored copy can be rendered in the desired format and served. This improvement in subsequent report processing is especially evident when the report is quite large and accessed frequently.

Please note that the cached report will continue to show the same data, even if the data has changed in the database, until the cache is refreshed. You can set the expiration date in Report Manager. After expiration, a cached report is replaced with a newer version when the user selects the report again.

Thursday, 17 August 2017

What is Snapshot Report in SSRS?

A Report Snapshot in SSRS is a report that contains layout information and a data-set that is retrieved at a specific point in time. Unlike on-demand reports, which get up-to-date query results when you select them, report snapshots are processed on a schedule and then saved to a report server. When you select a report snapshot for viewing, the report server retrieves the stored report from the report server database, and shows the data and layout that were current for the report at the time the snapshot was created.

Steps to create a report snapshot:

  • Go to Report Manager, where the RDLs are deployed.
  • Right-click the RDL and select Manage.
  • Then select Snapshot Options from the left pane and schedule the snapshot of the report.
 

Wednesday, 16 August 2017

How to Replace Null Values in SSRS Report ?

You can replace NULL values with a custom value using the IIF and IsNothing functions in SSRS.

Right-click the text box in which you want to replace the NULL value and write an expression:

=IIF(IsNothing(Fields!ColName.Value),0,Fields!ColName.Value)   [To replace with 0]
or
=IIF(IsNothing(Fields!ColName.Value),"Not Available",Fields!ColName.Value)   [To replace with a string]

Friday, 28 July 2017

Sunday, 16 July 2017

What is Report Builder in SSRS?


Report Builder is a report authoring tool used to design ad-hoc reports and to manage existing reports. You can preview your report in Report Builder and publish it to a Reporting Services report server. In short, Report Builder provides the capability to design, execute, and deploy SSRS reports.

How to display data on single tablix from two datasets in SSRS ?

We can display data from two datasets in a single tablix with the help of the Lookup function.
The two datasets must share a common column on which they can be joined.

Syntax: =Lookup(source_expression, destination_expression, result_expression, dataset)

  • source_expression: the name or key to look up, evaluated in the current scope.
  • destination_expression: the name or key to match on, evaluated for each row of the lookup dataset.
  • result_expression: the expression evaluated for the row in the lookup dataset where source_expression = destination_expression.
  • dataset: the name of the dataset from which the column is added.
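The matching behaviour of Lookup can be pictured with a small Python model. This is only a sketch of the idea, not SSRS code: the datasets, the ProductID/Name columns, and the lookup helper are hypothetical names chosen for illustration.

```python
def lookup(source_value, destination_key, result_key, dataset_rows):
    """Return result_key from the first row whose destination_key equals source_value."""
    for row in dataset_rows:
        if row[destination_key] == source_value:
            return row[result_key]
    return None  # stands in for Nothing, which Lookup returns when no row matches


# Hypothetical datasets that share the ProductID column.
sales = [
    {"ProductID": 1, "Qty": 5},
    {"ProductID": 2, "Qty": 3},
    {"ProductID": 9, "Qty": 1},  # no matching product
]
products = [
    {"ProductID": 1, "Name": "Bike"},
    {"ProductID": 2, "Name": "Helmet"},
]

# Each tablix row comes from the sales dataset plus one looked-up column.
tablix_rows = [
    {**row, "Name": lookup(row["ProductID"], "ProductID", "Name", products)}
    for row in sales
]
```

Rows without a match keep an empty Name, which is why the null-handling expression with IsNothing is often combined with Lookup.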

Friday, 14 July 2017

Top 50 DAX Interview Questions Answers PDF : SSAS/Power BI

Here are the top 50 SSAS/Power BI DAX interview questions with answers.

Explain what is DAX?
DAX stands for Data Analysis Expressions. It is a formula language: a collection of functions, operators, and constants that can be used in a formula, or expression, in Microsoft SQL Server Analysis Services, Power Pivot in Excel, and Power BI Desktop. Stated more simply, DAX helps you create new information from data already in your model.

Explain when do you use SUMX() instead of SUM()?
Use SUMX() when the expression to be summed consists of anything other than a single column name, typically when you want to add or multiply the values in different columns:
SUMX(Orderline, Orderline[quantity] * Orderline[price])
SUMX() first creates a row context over the Orderline table. It then iterates through this table one row at a time, evaluates the expression for each row, and sums the results. SUM() is optimized for reducing over column segments and is, as such, not an iterator.
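The difference between a column aggregation and an iterator can be sketched in Python. This models the semantics only and is not how the engine is implemented; the sample rows and the helper names sum_column and sumx are assumptions made for illustration.

```python
# Hypothetical Orderline rows.
orderline = [
    {"quantity": 2, "price": 10.0},
    {"quantity": 1, "price": 25.0},
    {"quantity": 4, "price": 2.5},
]


def sum_column(table, column):
    """Model of SUM(): aggregates exactly one column."""
    return sum(row[column] for row in table)


def sumx(table, expression):
    """Model of SUMX(): evaluates an expression per row, then sums the results."""
    return sum(expression(row) for row in table)


total_quantity = sum_column(orderline, "quantity")
revenue = sumx(orderline, lambda row: row["quantity"] * row["price"])
```

Here the lambda plays the role of the row context: it sees one row at a time, which is what allows two columns to be multiplied before summing.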

What do you understand by new CALENDARAUTO() Function in DAX(SSAS)?
CALENDARAUTO function returns a table with a single column named “Date” that contains a contiguous set of dates. The range of dates is calculated automatically based on data in the model.
Example: In this example, the MinDate and MaxDate in the data model are July 1, 2010 and June 30, 2011.
CALENDARAUTO() will return all dates between January 1, 2010 and December 31, 2011.
CALENDARAUTO(3) will actually return all dates between April 1, 2010 and March 31, 2012.
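The range arithmetic in this example can be checked with a short Python sketch. It assumes the minimum and maximum dates of the model are already known, and it only reproduces the boundary calculation; calendar_auto_range is a hypothetical helper, not a DAX or library function.

```python
import calendar
from datetime import date


def calendar_auto_range(min_date, max_date, fiscal_year_end_month=12):
    """Return (first_day, last_day) of the fiscal years spanning min_date..max_date."""
    # The fiscal year starts in the month after its end month.
    start_month = fiscal_year_end_month % 12 + 1

    # First day of the fiscal year that contains min_date.
    start_year = min_date.year if min_date.month >= start_month else min_date.year - 1
    start = date(start_year, start_month, 1)

    # Last day of the fiscal year that contains max_date.
    end_year = max_date.year if max_date.month <= fiscal_year_end_month else max_date.year + 1
    last_day = calendar.monthrange(end_year, fiscal_year_end_month)[1]
    end = date(end_year, fiscal_year_end_month, last_day)
    return start, end


default_range = calendar_auto_range(date(2010, 7, 1), date(2011, 6, 30))
march_range = calendar_auto_range(date(2010, 7, 1), date(2011, 6, 30), 3)
```

With the dates from the example, default_range covers January 1, 2010 to December 31, 2011, and march_range covers April 1, 2010 to March 31, 2012.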

Name any 3 of the most useful aggregation functions in DAX?
DAX has a number of aggregation functions, including the following commonly used functions:
  • SUM
  • AVERAGE
  • MIN
  • MAX
  • SUMX (and other X functions)
These functions work only on numeric columns, and generally can aggregate only one column at a time. However, special aggregation functions that end in X, such as SUMX, can work on multiple columns. These functions iterate through the table, and evaluate the expression for each row.

Which are the three places where an expression can be evaluated and hence a specific context is set?
1. In a pivot table cell. Filter context is set by rows, columns, filters and slicers.
2. In a row cell (calculated column). Row context is set by the row itself.
3. In the measure calculation area of a table. Neither row context nor filter context is set.

Name any 3 most useful text functions in DAX?
The text functions in DAX include the following:
  • CONCATENATE
  • REPLACE
  • SEARCH
  • UPPER
  • FIXED
These text functions work very similarly to the Excel functions that have the same name, so if you're familiar with how Excel handles text functions, you're already a step ahead. If not, you can always experiment with these functions in Power BI and learn more about how they behave.

How is filter context propagated through relationships?
Filter context automatically propagates along the filtering direction of the relationship. It always propagates from the one side of the relationship to the many side. In addition, you also have the option of enabling the propagation from the many side to the one side. No functions are available to force the propagation: everything happens inside the engine in an automatic way, according to the definition of relationships in the data model.

What are Path() and PathLength() functions in DAX?
PATH(): Syntax PATH(<ID_columnName>, <parent_columnName>). It returns a delimited text string with the identifiers of all the parents of the current identifier, starting with the oldest and continuing until the current one.
PATHLENGTH(): Syntax PATHLENGTH(<path>). It returns the number of parents to the specified item in a given PATH result, including the item itself.
Example: if item 3 has parent 14, and 14 has parent 112 (which has no parent), PATH for item 3 returns "112|14|3" and PATHLENGTH of that result returns 3.
Ref: Microsoft
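For readers who prefer to see the parent-child walk spelled out, here is a rough Python model of the two functions. The parent_of mapping and the helper names are hypothetical, and the sketch assumes the hierarchy contains no cycles.

```python
def path(item_id, parent_of):
    """Model of PATH(): identifiers from the oldest ancestor down to item_id, joined by '|'."""
    chain = []
    current = item_id
    while current is not None:  # assumes the hierarchy has no cycles
        chain.append(str(current))
        current = parent_of.get(current)
    return "|".join(reversed(chain))


def path_length(path_text):
    """Model of PATHLENGTH(): number of items in the path, including the item itself."""
    return len(path_text.split("|")) if path_text else 0


# Hypothetical hierarchy: 112 is the root, 14 reports to 112, 3 reports to 14.
parent_of = {112: None, 14: 112, 3: 14}

path_of_3 = path(3, parent_of)
depth_of_3 = path_length(path_of_3)
```

The depth returned by path_length is the value typically used to decide how many hierarchy levels a row belongs to.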

What is the difference between DISTINCT() and VALUES() in DAX? 
Both return the distinct values of a column, but VALUES() also returns the implicit blank row that the engine adds when a child table contains values with no match. This blank row usually appears in a dimension table.

Which function should you use rather than COUNTROWS(DISTINCT())?
DISTINCTCOUNT()

What is a pattern?
A pattern is a general reusable solution to a commonly occurring problem. In Microsoft Excel, you use patterns every day to build tables, charts, reports, dashboards, and more.

What are DAX patterns?
DAX Patterns is a collection of ready-to-use data models and formulas in DAX, which is the programming language of Power Pivot. Create your Excel data model faster by using a DAX pattern!

Explain RELATED() and RELATEDTABLE()?
RELATED works when you have a row context on the table on the many side of a relationship. RELATEDTABLE works if the row context is active on the one side of a relationship. It is worth noting that both, RELATED and RELATEDTABLE, can traverse a long chain of relationships to gather their result; they are not limited to a single hop.

Explain how a pivot table can be viewed as an MVC system?
1. Model = the Data Model (incl. DAX expressions)
2. View = the table (or chart)
3. Controller = rows + columns + filters + slicers

What can you say about automatic filter propagation?
Filters only ever flow automatically from the "one" side of the relationship to the "many" side of the relationship; from the "arrow" side to the "dot" side; from the lookup table to the data table. Whatever terms you use, it's always downhill.
Placing the lookup tables above the data tables in the diagram is a mental cue that helps you instantly visualize how automatic filter propagation works.

How does CALCULATE() result in context transition?
1. When in row context, it transitions to filter context: the filter on the current row of a specific table propagates through the relationships to the related tables before the calculation is completed.
E.g. CALCULATE(SUM(OtherTable[column])) in a calculated column.
2. It extends or modifies an existing filter context, by adding a filter as its second parameter.
CALCULATE() always introduces filter context.

What is the difference between MAX and MAXA functions in DAX?
The MAX function takes as an argument a column that contains numeric values. If the column contains no numbers, MAX returns a blank. If you want to evaluate values that are not numbers, use the MAXA function.

How are row contexts created?
1. Automatically in a calculated column
2. Programmatically by using iterators.

How are filter contexts created?
1. Automatically by using fields on rows, columns, slicers, and filters.
2. Programmatically by using CALCULATE()

How can you propagate row context through relationships?
Propagation happens manually by using RELATED() and RELATEDTABLE(). These functions need to be used on the correct side of a one-to-many relationship: RELATED() on the many side, RELATEDTABLE() on the one side.
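The two directions can be illustrated with a small Python model of a one-to-many relationship. The products and sales tables, their columns, and the helper functions are hypothetical; the sketch only mirrors which side each function is called from.

```python
# "One" side: one row per product, keyed by ProductID.
products = {
    1: {"ProductID": 1, "Category": "Bikes"},
    2: {"ProductID": 2, "Category": "Accessories"},
}

# "Many" side: several sales rows can point at the same product.
sales = [
    {"SaleID": 100, "ProductID": 1, "Amount": 50},
    {"SaleID": 101, "ProductID": 1, "Amount": 70},
    {"SaleID": 102, "ProductID": 2, "Amount": 15},
]


def related(sale_row, column):
    """Model of RELATED(): from a row on the many side, fetch one value from the one side."""
    return products[sale_row["ProductID"]][column]


def related_table(product_row):
    """Model of RELATEDTABLE(): from a row on the one side, fetch all matching many-side rows."""
    return [s for s in sales if s["ProductID"] == product_row["ProductID"]]


category_of_first_sale = related(sales[0], "Category")
bike_sales_total = sum(s["Amount"] for s in related_table(products[1]))
```

RELATED yields a single value because each sale has at most one product, while RELATEDTABLE yields a table that usually needs an aggregation around it.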

How does SUMMARIZECOLUMNS relate to filtering?
1. SUMMARIZECOLUMNS is not affected by outer (external) filters, in contrast to SUMMARIZE.
2. You can add a filter (e.g. using FILTER) as a parameter of SUMMARIZECOLUMNS and it will filter accordingly. It acts as if you had added a filter in a pivot table.

What is the initial filter context?
The initial filter context comes from four areas of a pivot table:
1. Rows
2. Columns
3. Filters
4. Slicers
It is the standard filtering coming from a pivot table before any possible modifications from DAX formulas using CALCULATE().


Why don't you use CALCULATE() in the aggregation expression of SUMMARIZECOLUMNS()?
The CALCULATE() is automatically generated.

What is the difference between passing a measure as the second FILTER parameter and passing the measure's underlying expression, i.e. FILTER(table, [MEASURE]) vs FILTER(table, SUM(...))?
A measure reference is always wrapped in an implicit CALCULATE(), so the row context introduced by FILTER is transitioned into filter context for the measure.
With the bare expression there is no context transition, so the expression ignores the current row and is evaluated in the unchanged outer filter context for every row. This gives a different result.

What is the SQL equivalent of nested functions in DAX?
SQL subqueries

Wednesday, 12 July 2017

What are different types of DataSources in SSRS ?

In SSRS, several options are available for choosing the data source.

Some of the available data source options are listed below:

Microsoft SQL Server
Microsoft SQL Azure
Microsoft SQL Server Parallel Data Warehouse
OLEDB
Microsoft SQL Server Analysis Services   
Oracle
ODBC
XML
Report Server Model
Microsoft SharePoint List
SAP NetWeaver BI
Hyperion Essbase
Teradata

How to display total number of rows in a Single SSRS report?

We can display the total number of rows in a single SSRS report with the help of the CountRows function. Insert a text box and write the expression in it.

SYNTAX: CountRows(scope)

NOTE: You can prepend a string to the CountRows result by using "&" for more readability in the report, for example ="Total Rows: " & CountRows("DataSet1"), where DataSet1 is a placeholder dataset name.


 

Thursday, 6 July 2017

Top 30 Database Testing Interview Questions Answers for Database Testers

Tuesday, 4 July 2017

Repeat Rows N Times According to Column Value in SQL Server : HCL Interview Question

Suppose you have a table #Temp with the columns Name and NTimes, and you want to repeat each row as many times as its "NTimes" column value. How would you generate that output? (Asked during an HCL interview.)

Ans:


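-- Build one <val> element per repetition, then shred the XML back into one row per element.
-- Assumes Name contains no XML special characters such as & or <.
SELECT A.Name, A.[NTimes] FROM
(SELECT Name, [NTimes], CAST('<val>' + REPLICATE(Name + '</val><val>', [NTimes] - 1) + '</val>' AS XML) AS X FROM #Temp) A
CROSS APPLY A.X.nodes('/val') y(z)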
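As an alternative to the XML trick, the same output can be produced with a recursive CTE. The sketch below runs the idea through Python's built-in sqlite3 module so that it is easy to try; it is SQLite syntax rather than T-SQL, and the sample rows and the table name Temp are made up for illustration.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Temp (Name TEXT, NTimes INTEGER)")
conn.executemany(
    "INSERT INTO Temp VALUES (?, ?)",
    [("A", 1), ("B", 3), ("C", 2)],
)

# The anchor member emits each row once; the recursive member
# re-emits it until the counter n reaches NTimes.
query = """
WITH RECURSIVE repeated(Name, NTimes, n) AS (
    SELECT Name, NTimes, 1 FROM Temp WHERE NTimes >= 1
    UNION ALL
    SELECT Name, NTimes, n + 1 FROM repeated WHERE n < NTimes
)
SELECT Name, NTimes FROM repeated ORDER BY Name, n
"""
rows = conn.execute(query).fetchall()
conn.close()
```

The recursive approach avoids string building, so it is not affected by special characters in Name, at the cost of one recursion step per repetition.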


Monday, 3 July 2017

How to find Running Total in SSRS ?

We can find a running total in SSRS with the help of the RunningValue function.
The RunningValue function returns a running aggregate of all non-null numeric values specified by the expression, evaluated for the given scope.

Syntax :  RunningValue(expression, function, scope) 

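Example: right-click the text box, choose Expression, and enter an expression such as =RunningValue(Fields!SalesAmount.Value, Sum, Nothing), where SalesAmount is a placeholder field name.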
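What a running aggregate produces row by row can be sketched in Python. This is a simplified model that ignores scope; running_value is a hypothetical helper, and None stands in for a null field value.

```python
def running_value(values, function=sum):
    """Return the running aggregate of the non-null values seen so far, one result per row."""
    seen = []
    results = []
    for value in values:
        if value is not None:  # null values are skipped by the aggregate
            seen.append(value)
        results.append(function(seen) if seen else None)
    return results


# Hypothetical detail rows, one of which is null.
amounts = [10, None, 5, 20]

running_sum = running_value(amounts)
running_max = running_value(amounts, max)
```

The row with the null amount simply repeats the previous total, since only non-null values feed the aggregate.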



 

Sunday, 2 July 2017

What is Drill Through Report in SSRS ?

Drill Through Reports: A drillthrough report is a report that opens when the user clicks a value in another report, redirecting the user to a detail report. Drillthrough reports commonly contain details about an item that is contained in the original summary report. These are standard reports that are accessed through a hyperlink on a text box in the original report.
You can create a drillthrough link by right-clicking the text box that you want to be hyperlinked and configuring its action to go to the detail report.

Thursday, 29 June 2017

Wednesday, 28 June 2017

What is Spool Operator in SQL Server: SQL Server Performance Tuning

The Spool operator scans the input and places a copy of each row in a hidden spool table that is stored in the tempdb database and exists only for the lifetime of the query. If the operator is rewound (for example, by a Nested Loops operator) but no rebinding is needed, the spooled data is used instead of re-scanning the input.
Table Spool is a physical operator.

Friday, 16 June 2017