A Database is a collection of data.
A Database management systems (DBMS) is software system for creating and managing databases.
Database management systems is also called DBMS in short.
- Hierarchical databases.
- Network databases.
- Relational databases.
- Object-oriented databases
- Relational Database Management System(RDBMS) is based on a relational model of data that is stored in tables inside the database.
- Data can be accessed, modify and managed using Structured Query Language (SQL).
- NoSQL, is an alternative to traditional relational databases (RDBMS).
- NoSQL databases are especially useful for working with large sets of distributed, semi-structured, un-structured data.
- MongoDB is one of the best example of most popular NoSQL database.
Key: A key is single or combination of multiple fields used to access, modify data in database.
-
- Candidate key: A Candidate Key is a set of one or more fields/columns that can identify a record uniquely in a table. There can be multiple Candidate Keys in one table
- Primary key: The Minimal set of Candidate key which can uniquely identify a record in table, is called Primary key. A table can have only one primary key and it does not allow null value.
- Alternate key: A Alternate key is a key that can be work as a primary key. Basically it is a candidate key that currently is not primary key.
- Composite key: A Composite Key is a combination of more than one fields/columns of a table. It can be a Candidate key, Primary key
- Unique key: A Unique key is a set of one or more fields/columns of a table that uniquely identify a record in database table, Unique key can have only one NULL value.
- Super key: Super key is a set of one or more than one keys that can be used to identify a record uniquely in a table. Primary key, Unique key, Alternate key are the subset of Super Key.
- Foreign key: A Foreign Key is a field in database table that is Primary key in another table. It can accept multiple null, duplicate values.
- RESTRICT: Rejects the delete or update operation for the parent table.
- CASCADE: will propagate the change when the parent changes. (If you delete a row, rows in constrained tables that reference that row will also be deleted, etc.)
- SET NULL: Delete or update the row from the parent table, and set the foreign key column or columns in the child table to NULL.
Note:- If you specify a SET NULL action, make sure that you have not declared the columns in the child table as NOT NULL. - NO ACTION: A keyword from standard SQL. In MySQL, equivalent to RESTRICT.
- SET DEFAULT: This action is recognized by the MySQL parser, but both InnoDB and NDB reject table definitions containing ON DELETE SET DEFAULT or ON UPDATE SET DEFAULT clauses.
You may also like – Interview Questions Answers
- Primary key cannot have NULL value, while unique key can have NULL values.
- A table can have only one primary key, while a table can have multiple unique key.
- Normalization is the process of organizing (decomposing) the data in a relational database in accordance with a series of normal forms in order
- To Reduce Data Redundancy
- To Improve Data Integrity
- To Remove Insert, Update & Delete Anomalies.
- It was first proposed by Edgar F. Codd
-
-
First Normal Form (1NF):
- A Table (relation) is in 1NF if each columns (attributes) contains an atomic value.
As per the rule of first normal form, an column (attribute) of a table cannot hold multiple values.
Student table which is not in First Normal Form(1NF):
id name subject 11 Jai PHP, Python 12 Shiv Java 13 Ajay C++ 14 Vijay React, Angular 15 Abdul Node js Now, Student table is in First Normal Form(1NF):
id name subject 11 Jai PHP 11 Jai Python 12 Shiv Java 13 Ajay C++ 14 Vijay React 14 Vijay Angular 15 Abdul Node js Although some values are still getting repeated but values for the subject column are now atomic for each row.
By Using the First Normal Form(1NF), data redundancy increases, as there will be many columns with same data in multiple rows but each row as a whole will be unique.
-
Second Normal Form (2NF):
A table is will be in 2NF if it satisfy following conditions:
- It is in First Normal Form(1NF).
- It should not have any Partial Dependency.
id student_id subject_id marks trainer 1 11 1 70 PHP Trainer 2 11 2 75 Python Trainer 3 12 3 80 JavaScript Trainer Now if you look at the Score table, we have a column names trainer which is only dependent on the subject (subject_id).
Now as we just discussed that the primary key for this table is a composition of two columns which is student_id and subject_id but the trainer’s name only depends on subject.
This is Partial Dependency, where an attribute in a table depends on only a part of the primary key and not on the whole key.
Let’s remove Partial Dependency by decomposing tables like below:
subject_id trainer 1 PHP Trainer 2 Python Trainer 3 JavaScript Trainer id student_id subject_id marks 1 11 1 70 2 11 2 75 3 12 3 80
-
-
-
Third Normal Form (3NF):
A table is will be in 3NF if it satisfy following conditions:
- It is in Second Normal Form(2NF).
- It should not have any Transitive Dependency.
-
When a non-prime attribute depends on other non-prime attributes rather than depending upon the prime attributes or primary key, this is called Transitive Dependency.
Advantage of removing Transitive Dependency:
-
- Data duplication will reduce.
- Data integrity will be achieve.
-
Boyce & Codd normal form (BCNF):
Boyce and Codd Normal Form is a higher version of the Third Normal form. This form deals with certain type of anomaly that is not handled by 3NF
- DDL – Data Definition Language
- DML – Data Manipulation Language
- DQL – Data Query Language
- DCL – Data Control Language
- TCL – Transaction Control Language
- HAVING: It is used to add a condition with GROUP BY. HAVING clause used to select rows after grouping.
- WHERE: It is used to add a condition to filter data. WHERE clause used to select rows before grouping.
An index is used to speed up the performance of SQL queries, It can be created on column or group of columns
A Trigger is a set of SQL statement that are associated with a table which automatically get executes before/after insert, update or delete operations.
- Before/After Insert
- Before/After Update
- Before/After Delete
- code
- source
- CREATE TRIGGER trigger_name
- when
- ON table_name
- FOR EACH ROW
- BEGIN
- //SQL Statements
- END
#Example: Following is an example of a trigger on employee table which updates the increment column with the incremented salary of an employee.
- code
- source
- CREATE TABLE `employee` (
- `id` INT(11) NOT NULL AUTO_INCREMENT,
- `name` VARCHAR(250) NOT NULL,
- `manager_id` INT(11) NULL DEFAULT NULL,
- `salary` DECIMAL(10,2) NOT NULL DEFAULT ‘0.00’,
- `increment` DECIMAL(10,2) NOT NULL DEFAULT ‘0.00’,
- `created_at` TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
- PRIMARY KEY (`id`)
- );
- code
- source
- DELIMITER $$
- CREATE OR REPLACE TRIGGER empTrigger
- BEFORE UPDATE ON employee
- FOR EACH ROW
- BEGIN
- set new.increment = new.salary – old.salary;
- END $$
- DELIMITER ;
Update all employees salary by 10%. once the following query will get executed after update empTrigger will also get executed and update increment column with increased salary of each employees.
- code
- source
- UPDATE employee SET salary = salary + (salary * (10 /100) );
A stored procedure is a set of SQL statements, invoke whenever we call it.
- It is Faster because –
- It is Pre-compiled
- It reduce network traffic
- It is reusable
- It can handle complex operation
- It support nested stored procedure
- Secure – It’s security is high. No body can see the code because it stay in Database Server.
- It’s Debugging is difficult
- It’s Difficult to maintain
- Memory usage increased
A view is a virtual table whose content are defined by a query and stored in database. A view contains rows and columns, just like a real table.
- code
- source
- //Create View
- CREATE VIEW VIEW_NAME
- AS
- SQL_QUERY;
- //Select Data from View
- Select * from VIEW_NAME;
- //Note – Replace VIEW_NAME & SQL_QUERY with their actual value.
Note: As data update in tables view update automatically.
The concept of CAP theorem was initially proposed by Eric Brewer.
A distributed database system can only have only two proporties at a time among three properties of CAP theorem i.e. Consistency, Availability and Partition Tolerance.
- Consistency: Every read receives the most recent write or an error.
- Availability: Every request receives a (non-error) response – without the guarantee that it contains the most recent write.
- Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes.
- Consistency: If you are going develope a transaction based system/project, relational database system then consistancy is more important than availablity.
- Availability: If you are going develope a report dashboard system, analytics system, big data project, nosql database then availablity is more important than consistancy.
Note:
- So it’s upto your requirement which combination you have to choose among consistancy, availablity, partition tolerance.
- Generally, RDBMS focus on consistancy while NoSQL database focus on Availability.
In order to maintain consistency in a database, before and after transaction, certain properties are followed, called ACID properties.
- Atomicity: It states that either all or none.
- Consistency: The database must remain in a consistent state after any transaction.
- Isolation: It ensures that multiple transactions can occur concurrently without leading to inconsistency of database state.
- Durability: The database should be durable enough to hold all its latest updates even if the system fails or restarts.
Examples: Almost all RDBMS like MySQL, Oracle, MSSQL Server, Postgre SQL etc follows ACID Properties.
In order to maintain high availablity of a database, certain properties are followed, called BASE properties.
- Basic Availability: The database system does gaurantees their availablity most of the time. i.e. Database System is Highly Available.
- Soft-state: It indicates that the state of the system may change over time, even without input. This is because of the eventual consistency model.
- Eventual consistency: It indicates that the system will become consistent over time, given that the system doesn’t receive input during that time.
Examples: Almost all NoSQL databases like MongoDB, DynamoDB etc follows BASE Properties