Thursday, 29 October 2015

What is the difference between Cassandra, Hadoop Big Data, MongoDB, CouchDB? Cassandra Interview Questions for Experienced

Interview Question: What is the difference between Cassandra, Hadoop Big Data, MongoDB, CouchDB?

Answer:
Hadoop

  • An entire ecosystem of integrated distributed computing tools, at the core of which are a file system (HDFS) and a programming framework (Map-Reduce).

Cassandra

  • A NoSQL data store based on a key-value pairing system, where the value is further structured into a column-oriented (wide-column) layout.

MongoDB

  • A NoSQL data store based on a key-value pairing system where the value is a JSON-like document. It has its own query language.

CouchDB

  • A NoSQL data store based on a key-value pairing system where the value is a JSON document. It uses a combination of HTTP, JavaScript, and map-reduce for querying.

Cassandra, Mongo, and Couch are pretty similar in that they are key-value based NoSQL data stores. They each have their advantages and disadvantages. If you're interested in one, you should at least have a good understanding of when to use one NoSQL store rather than another.

Hadoop is much bigger in scope to learn than the others because it comprises many different components, including its own columnar (HBase) and SQL-like (Hive) data storage platforms.
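To make the Map-Reduce programming model concrete, here is a minimal single-process sketch in plain Python. It does not use Hadoop at all; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and a real Hadoop job would distribute these phases across many machines.

```python
from collections import defaultdict


def map_phase(line):
    # Map step: emit one (word, 1) pair for every word in the input line.
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle step: group all emitted values by their key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce step: collapse the values of one key into a single result.
    return key, sum(values)


def word_count(lines):
    # Run map, shuffle, and reduce in sequence on an in-memory list of lines.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big deal"]))
```

The point of the split is that the map and reduce steps have no shared state, which is what allows a framework to run them in parallel on separate nodes.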

As a side note, I wouldn't pay a dime to "learn" these technologies.  They're all open source and there's no shortage of examples or documentation available for all of them.

Reference: Quora.com

Cassandra interview questions: Difference between Cassandra's schema and RDBMS schema?

The following table lists the points that differentiate the schema of Cassandra from that of an RDBMS.

S.N | RDBMS | Cassandra
1 | RDBMS deals with structured data. | Cassandra deals with unstructured data.
2 | It has a fixed schema. | Cassandra has a flexible schema.
3 | In RDBMS, a table is an array of arrays. (ROW x COLUMN) | In Cassandra, a table is a list of “nested key-value pairs”. (ROW x COLUMN key x COLUMN value)
4 | Database is the outermost container that contains data corresponding to an application. | Keyspace is the outermost container that contains data corresponding to an application.
5 | Tables are the entities of a database. | Tables or column families are the entities of a keyspace.
6 | Row is an individual record in RDBMS. | Row is a unit of replication in Cassandra.
7 | Column represents the attributes of a relation. | Column is a unit of storage in Cassandra.
8 | RDBMS supports the concepts of foreign keys and joins. | Relationships are represented using collections.
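
The "array of arrays" versus "nested key-value pairs" distinction can be pictured with ordinary Python containers. This is only a mental model with invented sample data and a hypothetical helper (get_column); it says nothing about how either system stores data on disk.

```python
# RDBMS view: every row has the same columns, in the same positions.
rdbms_columns = ["id", "name", "city"]
rdbms_table = [
    [1, "Asha", "Pune"],
    [2, "Ravi", None],  # a missing value still occupies its column slot
]

# Cassandra view: row key -> {column key -> column value}.
cassandra_table = {
    "user:1": {"name": "Asha", "city": "Pune"},
    "user:2": {"name": "Ravi"},  # this row simply has no "city" column
}


def get_column(table, row_key, column_key, default=None):
    # Look up one column value by row key and column key.
    return table.get(row_key, {}).get(column_key, default)


if __name__ == "__main__":
    print(rdbms_table[0][rdbms_columns.index("city")])
    print(get_column(cassandra_table, "user:1", "city"))
    print(get_column(cassandra_table, "user:2", "city"))
```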

Wednesday, 28 October 2015

What is the difference between Apache Storm and Apache Spark? Hadoop interview questions


  • Apache Storm operates on data in motion (continuous stream of data). The real time nature is due to its ability to operate on streaming data (data flowing through a set of queries).


  • Apache Spark operates on data at rest. Its real-time nature is due to its ability to perform computations on data (RDDs) very quickly, but these are still batch computations, as in Hadoop.


  • Spark Streaming, however, combines both: it treats streaming computations as a series of deterministic batch computations on small time intervals.
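
The idea of treating a stream as a series of small batches can be sketched without Spark. The snippet below is plain Python, not the Spark Streaming API; micro_batches and process_stream are hypothetical names, and events are assumed to be (timestamp_in_seconds, value) pairs.

```python
def micro_batches(events, interval):
    # Assign each event to the time window it falls into, keyed by window start.
    batches = {}
    for timestamp, value in events:
        window_start = int(timestamp // interval) * interval
        batches.setdefault(window_start, []).append(value)
    # Return the windows in time order so the result is deterministic.
    return [(start, batches[start]) for start in sorted(batches)]


def process_stream(events, interval):
    # Run an ordinary batch computation (here, a sum) on each small window.
    return [(start, sum(values)) for start, values in micro_batches(events, interval)]


if __name__ == "__main__":
    events = [(0.2, 1), (0.9, 2), (1.1, 3), (2.5, 4)]
    print(process_stream(events, interval=1))
```

Because each window is processed as a normal batch, the same input always yields the same per-window results, which is what "deterministic batch computations" refers to.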


Monday, 26 October 2015

SQL SERVER VERY BASIC INTERVIEW QUESTIONS FOR FRESHER

SQL SERVER INTERVIEW QUESTIONS WITH ANSWERS FOR FRESHER (for less than 1 year of experience)

1). What is SQL? 
Ans: SQL (Structured Query Language) is used to perform operations on the records stored in a database, such as updating records, deleting records, and creating and modifying tables, views, etc.

2). What's a Database? 
Ans: A collection of stored data objects.

3). What is RDBMS?
Ans: RDBMS stands for Relational Database Management System. RDBMS is the basis for SQL, and for all modern database systems such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access. The data in an RDBMS is stored in database objects called tables.

4). What's a Table? 
Ans: Store data; core component of all databases.

5). What are Views? 
Ans: Virtual tables; usually

6). What are Stored Procedures? 
Ans: Commands that can be executed to make changes and return data.

7). What are Functions? 
Ans: Commands that can be executed to only return data.

8). What are Constraints? 
Ans: Ensure valid values.

9). What are Primary Keys?
Ans: Unique values required for each row, designed to uniquely identify each row.

10). What are Foreign Keys?
Ans: Used to create relationships between tables. A foreign key is the column that points to the primary key of another table.

11). What is the difference between a primary key and a unique key?
Ans: By default, a primary key creates a clustered index and does not allow NULLs. By default, a unique key creates a non-clustered index and allows one NULL.

12). Why use Joins? 
Ans: Data is divided across multiple tables; joins combine those tables to make the data "human readable".
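
As a rough illustration of that last answer, the sketch below splits data across two tables and joins them back together. It uses Python's built-in sqlite3 module as a stand-in because it runs anywhere; it is not SQL Server, and the table names, column names, and sample rows are all invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (
        CustomerId INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    CREATE TABLE Orders (
        OrderId    INTEGER PRIMARY KEY,
        CustomerId INTEGER NOT NULL REFERENCES Customers(CustomerId),
        Amount     REAL NOT NULL
    );
    INSERT INTO Customers VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO Orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# Orders stores only a CustomerId; the join brings back the readable name.
rows = conn.execute("""
    SELECT c.Name, o.OrderId, o.Amount
    FROM Orders AS o
    INNER JOIN Customers AS c ON c.CustomerId = o.CustomerId
    ORDER BY o.OrderId
""").fetchall()

if __name__ == "__main__":
    for row in rows:
        print(row)
```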

Friday, 9 October 2015

TOP 10 SQL SERVER CONSTRAINTS RELATED INTERVIEW QUESTIONS

Hey developers, this time I come with one of the most important SQL Server topics: SQL Server constraints. If you are going to attend a SQL Server interview, you will almost certainly face questions about constraints. So this article is for you: here are the top SQL Server constraint interview questions. Let's start...

SQL Constraints
NOT NULL - The field must have a value in every row.
DEFAULT - If you do not supply a value, the default value is assigned to the field.
PRIMARY KEY - NOT NULL + UNIQUE.
FOREIGN KEY - References a column of another table (usually its primary key).
UNIQUE - All values in the field must be different, but one NULL value is allowed.
CHECK - An integrity constraint that specifies a requirement every row in the table must meet.

1). Define Constraints?
A constraint is a table column property that performs data validation. Using constraints, you can maintain data integrity by preventing invalid data from being entered.
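
Here is a small sketch of constraints rejecting invalid data. It runs against Python's built-in sqlite3 module, not SQL Server, so details differ (for example, SQLite enforces foreign keys only after a PRAGMA, and its handling of NULLs in UNIQUE columns is not the same). The tables, columns, and the try_insert helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement
conn.executescript("""
    CREATE TABLE Departments (
        DeptId INTEGER PRIMARY KEY,
        Name   TEXT NOT NULL UNIQUE
    );
    CREATE TABLE Employees (
        EmpId  INTEGER PRIMARY KEY,
        Name   TEXT NOT NULL,
        Salary REAL NOT NULL CHECK (Salary > 0),
        Status TEXT NOT NULL DEFAULT 'active',
        DeptId INTEGER NOT NULL REFERENCES Departments(DeptId)
    );
    INSERT INTO Departments (DeptId, Name) VALUES (1, 'Sales');
""")


def try_insert(sql):
    # Return None if the insert succeeds, or the constraint error message.
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


ok = try_insert("INSERT INTO Employees (EmpId, Name, Salary, DeptId) VALUES (1, 'Asha', 5000, 1)")
bad_check = try_insert("INSERT INTO Employees (EmpId, Name, Salary, DeptId) VALUES (2, 'Ravi', -1, 1)")
bad_fk = try_insert("INSERT INTO Employees (EmpId, Name, Salary, DeptId) VALUES (3, 'Mina', 4000, 99)")
bad_null = try_insert("INSERT INTO Employees (EmpId, Salary, DeptId) VALUES (4, 4000, 1)")
bad_unique = try_insert("INSERT INTO Departments (DeptId, Name) VALUES (2, 'Sales')")

# The DEFAULT constraint filled in Status for the one row that was accepted.
status = conn.execute("SELECT Status FROM Employees WHERE EmpId = 1").fetchone()[0]

if __name__ == "__main__":
    print(ok, bad_check, bad_fk, bad_null, bad_unique, status, sep="\n")
```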

2). What do you understand by Data integrity?
Data integrity is the consistency and accuracy of the data which is stored in a database.

3). Can you add constraints to a table that already has data?
Yes, but it also depends on the data. For example, if a column contains NULL values and you are adding a NOT NULL constraint, you first need to replace all the NULLs with some value.

4). How many primary keys can exist on a table? 
One

5). What is a Foreign Key?
A FK in one table points to a PK in another table.
It prevents any actions that would destroy links between tables with the corresponding data values.
FKs are used to enforce referential integrity.

6). Difference between Primary key and Unique key constraints?
1) A primary key does not allow NULL values. A unique constraint does: if the field is nullable, a unique constraint allows at most one NULL value.
2) SQL Server allows many unique constraints per table, but only one primary key per table.

7). Can we apply unique key constraints on multiple columns?
Yes! A unique key constraint can be applied to a composite of multiple fields to ensure uniqueness of records.
Example : City + State in the StateList table

8). When you create a unique key constraint, which index is created by the DB by default?
A nonclustered index is created automatically when you create a unique key constraint.

9). When you create a primary key constraint, which index is created by the DB by default?
A clustered index is created automatically when you create a primary key constraint, unless the table already has a clustered index or you specify NONCLUSTERED.

10). What do you understand by Default constraints?
A default constraint enters a value in a column when one is not specified in the Insert or Update statement.

11). What are the types of data integrity?
In a relational database, there are three types of integrity:
1. Domain Integrity (data types, check constraints)
2. Entity Integrity (primary key, unique constraints)
3. Referential Integrity (handled by foreign key constraints)

12). If you don't want to check referential integrity for existing data when you create the foreign key, which keyword will you use?
I will use WITH NOCHECK.

Thursday, 8 October 2015

SQL SERVER STORAGE/SIZE/CAPACITY RELATED INTERVIEW QUESTIONS ANSWERS

1). What is the fundamental unit of storage in SQL Server data files, and what is its size? 
Ans: A page, with a size of 8 KB.

2). How many (maximum) no. of columns can be created in a MS SQL Table?
Ans: Max Columns per 'nonwide' table: 1,024
Max Columns per 'wide' table: 30,000

3). What is the difference between Wide and Nonwide tables in SQL Server?
Ans: 1) A wide table can contain up to 30,000 columns; a non-wide (basic) table can contain only 1,024 columns.
2) Wide tables are considered denormalized tables; non-wide tables are considered normalized tables.
3) Wide tables are used in OLAP systems; non-wide tables are used in OLTP systems.
4) Wide tables are a new feature in SQL Server 2008, introduced to overcome the limit of 1,024 columns in non-wide tables.
5) Wide tables don't work with transactional or merge replication, but non-wide tables do.

4). Maximum how many rows can be in the SQL Server tables?
Ans: According to Microsoft specification:
Rows per table: Limited by available storage

But there are some cases where SQL Server will prevent you from adding more rows
  • If you have an IDENTITY column and you hit the end of the range for the data type, e.g. 255 for TINYINT, 2,147,483,647 for INT, some ungodly number that starts with a 9 - possibly the number of inches to the sun and back - for BIGINT, etc. When you try to insert the next row, you'll get error message 8115 about overflowing the type.
  • If you have a heap with a non-unique index, or a clustered index that is not unique, you won't be able to store more than 2 * 2,147,483,647 rows with the same index key value. When you try to insert (2 * 2,147,483,647) + 1 rows with a value of 1 in an INT column that is the only column in a clustered index, you will get error message 666 about exhausting the uniqueifier. This is because the uniqueifier (which helps SQL Server identify a row when there is no true key) is only 4 bytes, which means it can't exceed the capacity of an INT (it does use both positive and negative values, unlike IDENTITY unless you configure it as such, which is why you get double). Now why you would ever do this, <shrug>... but you could.
  • In the VLDB space, a database can only be 524,272 terabytes. Again a very edge case, but if you have a humongous data warehouse then obviously at some point the number of rows - depending on row size - will put you near this limit.
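
The numeric limits quoted in these bullets follow from the byte widths of the types, which a few lines of Python arithmetic can confirm. This is only a calculation, not a query against SQL Server; signed_max is a made-up helper, and the uniqueifier figure simply reproduces the "2 * 2,147,483,647" claim made above.

```python
def signed_max(num_bytes):
    # Largest value of a signed two's-complement integer of the given width.
    return 2 ** (8 * num_bytes - 1) - 1


# Upper end of the default (positive) IDENTITY range for each integer type.
IDENTITY_MAX = {
    "TINYINT": 2 ** 8 - 1,  # 1 byte, unsigned in SQL Server
    "SMALLINT": signed_max(2),
    "INT": signed_max(4),
    "BIGINT": signed_max(8),  # the number "that starts with a 9"
}

# 4-byte uniqueifier, using negative and positive values as described above.
UNIQUEIFIER_LIMIT = 2 * signed_max(4)

if __name__ == "__main__":
    for type_name, limit in IDENTITY_MAX.items():
        print(f"{type_name}: {limit:,}")
    print(f"uniqueifier: {UNIQUEIFIER_LIMIT:,}")
```
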
5). What is the maximum size of a varchar(max) variable?
Ans: The maximum size of a varchar(max) is 2 GB, or more exactly 2^31-1 (2,147,483,647) bytes.

6). What are the differences between varchar(8000) and varchar(max)?
  • Varchar(8000) stores a maximum of 8,000 characters. Varchar(max) stores a maximum of 2,147,483,647 characters.
  • VARCHAR(MAX) uses the normal data pages as long as the content fits in the row, up to 8,000 bytes, just like varchar(8000). When the content overflows, the data is stored off-row like the old TEXT and IMAGE types, and a pointer replaces the in-row content.
  • Columns that are of the large object (LOB) data types ntext, text, varchar(max), nvarchar(max), varbinary(max), xml, or image cannot be specified as key columns for an index.
  • VARCHAR(MAX) has some ambiguity: if the size of the cell is < 8000 chars, it will be treated as row data. If it's greater, it will be treated as a LOB for storage purposes. You can know this by querying RBAR.
7). How can I query my SQL Server to get only the size of a database?
Ans: USE YourDatabaseName;
EXEC sp_spaceused;

8). What would be the LEN and DATALENGTH of NULL value in SQL Server?
Ans: Both of the above functions return NULL as the length of NULL.

9). How much space does a NULL value take in SQL Server?
Ans:
  • If the field is fixed width storing NULL takes the same space as any other value - the width of the field.
  • If the field is variable width the NULL value takes up no space.
10). What would be the output of the following script?
Select LEN('A value') --Without space at end
Select LEN('A value  ') --With 2 space at end
Ans: Both will return 7, because the LEN function does not count trailing spaces in SQL Server.

11). How you will find the LEN in above case?
Ans: We can use the following trick:
Select LEN('A value  ' + 'x') - 1

12). Difference between Len() and DataLength()?
Ans: DATALENGTH()- returns the length of the string in bytes, including trailing spaces.
LEN()- returns the length in characters, excluding trailing spaces.

For example
SELECT LEN('string'), LEN('string '), DATALENGTH('string'), DATALENGTH('string '),
LEN(N'string'), LEN(N'string '), DATALENGTH(N'string'), DATALENGTH(N'string ')

With one trailing space in the padded literals, this will return 6, 6, 6, 7, 6, 6, 12, 14
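
The rules behind LEN() and DATALENGTH() can be mimicked in Python to check such results by hand. This is an emulation, not SQL Server: sql_len and sql_datalength are invented helpers, and the byte counts assume a single-byte code page for varchar and UTF-16 with 2 bytes per character for nvarchar.

```python
def sql_len(value):
    # Like LEN(): count characters, ignoring trailing spaces.
    return len(value.rstrip(" "))


def sql_datalength(value, unicode=False):
    # Like DATALENGTH(): count bytes, including trailing spaces.
    # Assumption: varchar is 1 byte per character, nvarchar is 2 bytes.
    encoding = "utf-16-le" if unicode else "latin-1"
    return len(value.encode(encoding))


if __name__ == "__main__":
    padded = "string "  # one trailing space
    print(sql_len("string"), sql_len(padded))
    print(sql_datalength("string"), sql_datalength(padded))
    print(sql_datalength("string", unicode=True), sql_datalength(padded, unicode=True))
    # The "+ 'x'" trick: append a non-space character, then subtract 1.
    print(sql_len("A value  " + "x") - 1)
```

For a value with one trailing space, the emulation gives a length of 6 but byte counts of 7 (varchar) and 14 (nvarchar), because only LEN() ignores the padding.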
