Introduction:

Today, generating data is a simple task for most organizations, but making the most of the data being generated is still elusive. Possessing the right information is essential to survive in ultra-competitive business environments, and in such an environment there is a definite need for technologies and tools that can help us make sense of the data. Microsoft's answer ships along with the SQL Server package: SQL Server Reporting Services (SSRS). In this article, we will go through its features and functionalities one by one, in as much detail as possible.

Table of Contents
- What is SSRS?
- SSRS for End Users
- Overview of Features
- SSRS Additional Features
- Report Usage Types
- SSRS in the Report Development Life Cycle
- Editions of Reporting Services
- How is SSRS Licensed?
- Conclusion

What Is SSRS?

SSRS stands for SQL Server Reporting Services; it is server-based report generation software developed by Microsoft, and it is Microsoft's answer to business reporting. SSRS provides a unified, server-based, extensible, and scalable platform through which all business reporting needs can be satisfied. It extends reporting from traditional paper-based output to interactive, web-oriented content, which can then be shared with users through email, file shares, and so on for maximum reach. SSRS can generate reports in various file formats such as HTML (Hypertext Markup Language), Microsoft Excel, and CSV (Comma-Separated Values). In addition, SharePoint can serve as a front end to these reports, which can then be pushed to corporate portals for regular access. SSRS is one of the tools that make up the Microsoft Business Intelligence (BI) platform.
Together with the other components of the Business Intelligence platform, SSRS enables sophisticated enterprise data analysis. The Microsoft Business Intelligence suite consists of the following:

- Microsoft SQL Server: the traditional database engine, which stores the SSRS catalog data along with the business data.
- SSAS (SQL Server Analysis Services): a powerful tool used for online analytical processing (OLAP) and data mining. OLAP helps aggregate data so you can look through its dimensions, whereas data mining helps discover patterns in the data.
- SSIS (SQL Server Integration Services): a component that helps in extracting data, transforming it as needed, and loading it, i.e. ETL.

The SSRS tool provides an interface into Microsoft Visual Studio so that developers and SQL database administrators can connect to SQL databases and prepare SQL reports in different ways. There is also a Report Builder tool in the package that comes in handy for less technical users to format SQL reports in standard layouts. SSRS competes with other business intelligence tools, such as Crystal Reports, in this line of technology.

SSRS for End Users:

Among the offerings in the Microsoft Business Intelligence suite, SSRS is a unique one, as it caters to a wide variety of users. Broadly, Microsoft has classified this diverse set of users into:

- Information Consumers
- Information Explorers
- Analysts

From this classification, anyone can vouch for the maximum usage being in the first category of users, the Information Consumers.
Data that is generated or already available will always be consumed, hence this group forms the largest user base. Information Consumers use the static, predefined, formatted reports that are made available to them. Information Explorers form the next largest user group: users who want to interact with the reports to some degree, for example by applying custom filters or drilling down into the available data. This requires some technical expertise, but it is not restricted to purely technical skills. Finally come the Analysts, the smallest user group, who can develop reports and also perform sophisticated calculations such as linear regressions or trend analysis. Analysts rely more on technical expertise to cater to all of these reporting needs and to satisfy the most critical and complex reporting requirements. It can thus be said that the reports generated by Analysts become the input for both Information Explorers and Information Consumers. To cater to the various needs of these users, SSRS provides the following tools, each suited to a particular user's perspective on reports:

Report Viewer: As the name suggests, this is the module you would use for viewing reports over the web. Information Consumers will mostly work through it, since the Report Manager that SSRS sets up serves this very need. Developers can also embed a Report Viewer control in ASP.NET and Windows Forms applications, which provides a hook for surfacing reports inside web pages or .NET applications.

Report Builder: This tool provides a user-friendly UI to cater to ad hoc reporting needs. It is set up against a SQL Server or an Analysis Services database. As the name suggests, this is the tool that Information Explorers will be keen to work with.
Unlike most other ad hoc reporting tools, there is no expectation of SQL knowledge here: users can generate reports without knowing Structured Query Language (SQL) or understanding complex joins.

Report Designer: This tool provides all the hooks required to generate complex reports. This is the forte of the Analysts, and this is where they kick into action. Though most reporting requirements can be handled by Report Builder itself, Report Designer was created to take on the really complex reports.

Overview of Features:

There are numerous features provided by the SSRS offering of Microsoft's Business Intelligence (BI) suite, and they address complex business reporting requirements and needs. Here is a brief overview. SSRS is a fully featured report engine: reports can be created against any data source that has a managed code provider, such as an OLE DB or ODBC data source. This means the data retrieval layer can pull data from SQL Server, Oracle, Analysis Services, Access, Essbase, and the like. Data can be presented in multiple ways, and with every release Microsoft has put user feedback to good use: there are new Chart and Gauge controls, and a Tablix control which is an amalgam of the Table and Matrix controls. Apart from these, new presentation formats such as Word and Excel were included, along with direct integration with SharePoint.
Let us now take a look at the features provided by the SSRS offering:

- Retrieve data from managed providers with OLE DB and ODBC connections
- Display data in tabular, free-form, and chart layouts
- Export data in many formats, such as HTML, PDF, XML, CSV, Word, and Excel
- Aggregate and summarize data
- Add report navigation
- Create ad hoc reports and save them to a server
- Create custom controls using a report processing extension
- Embed images, graphics, and external content
- Integrate with SharePoint
- A Simple Object Access Protocol (SOAP) Application Programming Interface (API) and a pluggable architecture
- Subscription-based and on-demand reports
- Store and manage reports that users generate with the Report Builder tool
- URL-based report access
- Display KPI data using the Gauge and Chart controls

Over and above all of these features, there is icing on the cake: the extensibility that SSRS provides, letting you embed reports or generate reports customized to your needs, goes well beyond what most developers anticipate.
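URL-based report access, one of the features listed above, works by passing `rs:`-prefixed arguments to the report server endpoint. As a rough sketch, a helper like the following could assemble such a request string (the server name, report path, and parameter names here are made up for illustration; the `rs:Command` and `rs:Format` arguments are standard SSRS URL-access parameters):

```python
from urllib.parse import quote

def build_report_url(server, report_path, fmt="PDF", params=None):
    """Assemble an SSRS URL-access request string.

    `server` and `report_path` are placeholders for your own
    report server and published report.
    """
    url = f"http://{server}/ReportServer?{quote(report_path)}"
    url += f"&rs:Command=Render&rs:Format={fmt}"
    # Report parameters are appended as plain name=value pairs.
    for name, value in (params or {}).items():
        url += f"&{quote(name)}={quote(str(value))}"
    return url

url = build_report_url("myserver", "/Sales/MonthlySales",
                       fmt="EXCEL", params={"Year": 2023})
print(url)
```

Requesting such a URL in a browser (against a real report server) would render the report in the chosen format; changing `rs:Format` is all it takes to switch between PDF, Excel, and the other supported renderers.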
SSRS Additional Features

1) IIS dependency removed in 2008
Because it conflicted with other applications, the dependency on IIS was removed in the 2008 release and replaced with these components:
- SQL OS
- SQL CLR
- SQL Network Interface
- Communication through the HTTP.SYS protocol

2) Rich memory management added
a) Server infrastructure for process memory monitoring: dynamic, self-managing under memory pressure; throughput is reduced in memory-pressure situations.
b) Report processing uses a file system cache to adapt to memory pressure and receives memory events from the server.
c) The administrator can set minimum and maximum targets. The minimum threshold defines the amount of memory the server considers "belongs" to it; that memory is only used if requests need it. The maximum threshold defines a not-to-exceed value.
d) The server adapts to other processes consuming memory.

3) RS 2008 report rendering changes
a) Report processing: on-demand processing; a hierarchical, cursor-based object model.
b) Rendering: a new rendering architecture; renderer rewrites.

4) Scalability
a) Reports in SQL Server 2005 are memory-bound: memory usage is proportional to data size, large datasets can cause out-of-memory exceptions, and memory usage is a problem in some renderers (PDF, Excel, CSV).
b) Very large reports can starve, or fail, many smaller reports.

5) Dundas acquisition
The SQL Server Reporting Services team acquired Dundas Software's data visualization products:
- Chart, Gauge, Map, Barcode, and Calendar for Reporting Services
- Chart and Gauge for SharePoint
- Chart Pro/Enterprise, Gauge, Map, OLAP Chart, and Tab controls for Visual Studio (Windows and web)

6) Tablix, a new data region
a) Tablix provides a combination of the best features of the Table and Matrix data regions.
b) It lets you build versatile reports.
c) It allows a flexible layout with multiple row and column groups.

HTTP Listener
It monitors incoming requests directed at HTTP.SYS on a specific port on the local computer; the host name and port are specified in a URL reservation when you configure the server. When the HTTP listener processes a request, it forwards it to the authentication layer to verify the user's identity.

Authentication Layer:
It verifies the user id and password, or the identity of the user or application making the request. The supported authentication types are:
- Windows integrated security
- NTLM authentication
- Forms authentication
- Basic authentication
- Anonymous access

Report Server
It is the heart of Reporting Services and is implemented as a Windows service. It consists of:
- Windows service
- Report Manager
- Web service
- Background processing

a) Windows service (provides report scheduling and delivery services): these services are used in designing, saving, executing, managing, and publishing reports. Reporting Services hosts Report Manager, the report server web service, and the background processing features within this one service.
b) Report Manager: it provides client front-end access to report server items and their management.
c) Web service: it provides access to the report server, for example from Report Builder.
d) Background processing: several kinds of processing happen here:
- Report processing
- Data processing
- Model processing
- Rendering
- Authentication extensions
- Scheduling
- Subscriptions
- Database maintenance

Report processing: the report server has two core processors:
a) Report processor
b) Scheduling and delivery processor

Report server back end: the report server stores folders and files much like a file system. A report you create exists as a file with the extension .rdl (Report Definition Language). When the report is published, it is stored in the report server database; the deployment uses two SQL Server relational databases for internal storage. ReportServerTempDB stores temporary data, session information, and caching information.

Data processing: a data processing extension is designed to retrieve a specific type of data source and provide extended functionality during report design and processing.

Rendering extensions: there are three rendering categories:
a) Data renderers: data-only output, e.g. CSV and XML.
b) Soft page-break renderers: maintain format and layout, e.g. Microsoft Word, Excel, MHTML, and the Report Viewer controls.
c) Hard page-break renderers: fixed page layouts, e.g. the image (TIFF) and PDF formats.

Scheduling and delivery extensions:
- Report server email
- Report server file share
- Custom extensions
- Subscriptions

Simple SSRS architecture: in general, reports are required in two situations:
a) Internal reports
b) External reports

Internal reports: these generally cover a company's internal operations, such as pay slips, salary slips, relieving letters, internal audits, etc.
External reports: these are generally submitted to third-party authorities such as the IT department, STPI, etc. To create these types of reports we can choose among different reporting applications such as Cognos, Business Objects (BO), SSRS, Crystal Reports, MicroStrategy, etc.

Report Usage Types

Standard Reporting: here there is a centralized database.
Multiple users connect to the database and generate their own reports.

Ad hoc Reporting: these can also be called dynamic reports; the content and layout change every time.

Embedded Reporting: here the reports are embedded within third-party applications, for example Java or .NET applications.

SSRS in the Report Development Life Cycle:

To understand the ways SSRS can be used or deployed, you should have a clear understanding of how its life cycle works. It also helps to understand which features come to your rescue at which stage of the report development life cycle. To keep it simple, any typical reporting application goes through three stages: Authoring, Managing, and Delivery, and there are tools that help through each of these stages. With this understanding, let us now look into each stage.

Authoring: the authoring stage is where the report author defines the report layout and the sources of data. Reports can be designed using either the Report Designer tool or Report Builder 1.0, depending on the release of SQL Server you are using. There is also the newer Report Builder 2.0, which fits nicely into the areas Analysts are interested in.

Managing: the managing stage is where the author publishes a report to a centralized location, where a report administrator scrutinizes it for security and delivery. Once the report is published, an administrator can use Report Manager, SharePoint, or SQL Server Management Studio to manage the published reports. SSRS passes the load tests of scaling from a single user to thousands of users, as well as the uptime and reliability requirements.
Delivery: delivery is the stage where the actual report gets distributed to the intended users, in many different formats (the SSRS rendering mechanism kicks in to let users change the output format of a requested report). SSRS provides a wide range of delivery methods, ranging from email, interactive online access (usually via SharePoint or custom applications), and printers to the file system. Reports are organized as items under folders, which makes for easier browsing and quicker execution.

Editions of Reporting Services:

SSRS (SQL Server Reporting Services) comes in four different editions, mirroring the editions of SQL Server: the Express edition, the Workgroup edition, the Standard edition, and, last but not least, the Enterprise edition. As expected, these range from a free edition to the fully scalable Enterprise edition. Let us look at each of these editions in some more detail.

Express Edition: the Express edition provides a lightweight version of SSRS for developers to use on an as-needed basis. Its features are limited compared to what is present in a full version of SSRS with SQL Server.

Workgroup Edition: the Workgroup edition is ideal for a small group of individuals or a branch office where the load is limited and the features used are also limited. Should there be a need to scale up, a Workgroup instance can always be upgraded to either the Standard or the Enterprise edition.

Standard Edition: the Standard edition of the tool is well suited for a small-to-medium organization or a single-server environment. The only two features missing from the Standard edition of SSRS are specialized data-driven subscriptions and infinite drilldown using Report Builder.
Enterprise Edition: the Enterprise edition of the tool is well suited for bigger organizations with more complex databases and more complex reporting requirements. It covers all the major features of SSRS and also supports scaling across a web farm.

How Is SSRS Licensed?

The simple answer is that any machine that runs Microsoft SQL Server is licensed not just for the database engine but for the entire Microsoft Business Intelligence (BI) platform. This means that one license covers Microsoft SQL Server, SSRS, SSAS, SSNS, and SSIS at once, which gives you the opportunity to work with SSRS without worrying about anything else. Currently there are three different ways to license a SQL Server installation; for specifics on how licenses can be procured, it is best to contact Microsoft representatives or resellers.

Per processor: in this method, a license is paid for each processor on the machine that runs a SQL Server instance. This is the optimal licensing method for web-facing or business-to-business machines running SQL Server, and it is very helpful for huge user populations.

Server license plus device client access licenses (CALs): the license cost is paid for the machine that runs SQL Server and for each device that connects to that SQL Server instance. An ideal case for this model is kiosks, where there are multiple users per device.

Server license plus user CALs: the license cost in this model is paid for the machine that runs the SQL Server instance and for each user accessing it. This is very useful for enterprises in which each user may access the SQL Server machine from many devices.
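The trade-off between the licensing models above is just arithmetic: per-processor licensing is a flat cost regardless of audience size, while server-plus-CAL pricing grows with the number of devices or users. A small sketch makes this concrete. The prices here are entirely hypothetical placeholders, not Microsoft's actual figures; consult Microsoft or a reseller for real numbers.

```python
def per_processor_cost(processors, price_per_processor):
    # One license per processor; unlimited users and devices.
    return processors * price_per_processor

def server_plus_cals_cost(server_price, cal_price, cal_count):
    # One server license plus one CAL per device (or per user).
    return server_price + cal_price * cal_count

# Hypothetical list prices for illustration only.
PROC_PRICE, SERVER_PRICE, CAL_PRICE = 25000, 6000, 160

# A 2-CPU web-facing box with thousands of anonymous users:
web_facing = per_processor_cost(2, PROC_PRICE)                 # 50000

# The same hardware serving only 40 named internal users:
internal = server_plus_cals_cost(SERVER_PRICE, CAL_PRICE, 40)  # 12400

print(web_facing, internal)
```

With these placeholder prices, CAL licensing is far cheaper for a small, known user base, while per-processor licensing wins once the audience is large or uncountable, which is exactly the guidance the text gives.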
Conclusion:

In this article we have gone through the concepts of SSRS (SQL Server Reporting Services) and looked at the circumstances and scenarios where it finds its usage. We have also covered the features that SSRS provides to its end users, discussed how to leverage the different SSRS features across the report development life cycle, and gone through a few more specifics, such as the editions and the licensing details. We hope this article provides the details you need if you are interested in the full feature set of SSRS. Please give us feedback on this article in the form of comments or suggestions. Though we have put a lot of effort into providing the most accurate details possible, we still request that you consult the Microsoft documentation as well if you intend to make any purchase decisions.
Data is defined as facts, figures, or information that is stored in or used by a computer. Technological advancements and the use of smart devices in recent years have led to a data revolution: more and more data is being produced by an ever-increasing number of internet-connected electronic devices. The amount of data, and the rate at which it grows, is vast. Due to this rapid growth, computation has become a major bottleneck; processing such extensive data requires more computational power than traditional data processors can deliver. To get a sense of the growth, consider this statistic: according to IBM, we create 2.5 quintillion bytes of data a day. To tackle this data processing problem, we need a platform that can address these data-related issues, and that led to the development of Hadoop. Today, Hadoop is helping us solve big data problems.

What is Hadoop

Hadoop is an open-source software platform for storing huge volumes of data and running applications on clusters of commodity hardware. It gives us massive data storage, enormous computational power, and the ability to handle virtually limitless concurrent jobs or tasks. Its core purpose is to support the growing set of big data technologies and thereby enable advanced analytics such as predictive analytics, machine learning, and data mining. Hadoop can handle different modes of data, structured, unstructured, and semi-structured, and gives us the flexibility to collect, process, and analyze data that our old data warehouses failed to handle.

Hadoop Ecosystem Overview

The Hadoop ecosystem is a platform, or framework, that helps in solving big data problems. It comprises different components and services (for ingesting, storing, analyzing, and maintaining data).
Most of the services in the Hadoop ecosystem supplement the four core components of Hadoop: HDFS, YARN, MapReduce, and Common. The ecosystem includes both Apache open source projects and a wide variety of commercial tools and solutions; some well-known open source examples are Spark, Hive, Pig, Sqoop, and Oozie. Now that we have some idea of what the Hadoop ecosystem is, what it does, and what its components are, let's discuss each concept in detail. The concepts below, taken together, constitute a Hadoop ecosystem.

Table of Contents
- HDFS (Hadoop Distributed File System)
- YARN
- MapReduce
- Apache Spark
- Hive
- HBase
- HCatalog
- Apache Pig
- Apache Sqoop
- Oozie
- Avro
- Apache Drill
- Apache Zookeeper
- Apache Flume
- Apache Ambari

HDFS (Hadoop Distributed File System)

The Hadoop Distributed File System is a storage system which runs on the Java programming language and is used as the primary storage layer in Hadoop applications. HDFS consists of two components, the NameNode and the DataNode, which together store large data across multiple nodes in the Hadoop cluster. First, let's discuss the NameNode.

NameNode: the NameNode is a daemon which maintains and manages all the DataNodes (slave nodes). It acts as the keeper of metadata for all blocks, holding information such as size, location, source, and hierarchy, and it records every change that happens to the metadata. If any file gets deleted in HDFS, the NameNode automatically records the change in the EditLog. The NameNode regularly receives heartbeats and block reports from the DataNodes in the cluster to ensure they are live and working.

DataNode: the DataNode is a slave daemon which runs on each slave machine and acts as the actual storage layer. It is responsible for serving read and write requests from users.
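The heartbeat mechanism just described, DataNodes reporting in regularly and the NameNode treating silent nodes as dead, can be sketched in a few lines of plain Python. This is a toy illustration, not Hadoop's actual API; the class and method names are invented for the example, and the timeout is simplified (real HDFS heartbeats default to every 3 seconds, with a much longer dead-node timeout).

```python
class NameNodeMonitor:
    """Toy sketch of NameNode-style liveness tracking."""

    def __init__(self, timeout=30):
        self.timeout = timeout
        self.last_seen = {}          # DataNode id -> time of last heartbeat

    def heartbeat(self, node_id, now):
        # Record a heartbeat from a DataNode at time `now`.
        self.last_seen[node_id] = now

    def live_nodes(self, now):
        # A node is live if it has reported within the timeout window.
        return [n for n, t in self.last_seen.items()
                if now - t <= self.timeout]

mon = NameNodeMonitor(timeout=30)
mon.heartbeat("dn1", now=0)
mon.heartbeat("dn2", now=0)
mon.heartbeat("dn1", now=25)     # dn1 keeps reporting; dn2 goes silent
print(mon.live_nodes(now=40))    # only dn1 is still within the timeout
```

In real HDFS the NameNode also uses these heartbeats to piggyback commands back to the DataNodes (replicate a block, delete a block), which is the instruction flow described in the next paragraph.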
It acts according to the instructions of the NameNode, which include deleting, adding, and replacing blocks. It sends heartbeat reports to the NameNode regularly; the default interval is once every 3 seconds.

YARN:

YARN (Yet Another Resource Negotiator) acts as the brain of the Hadoop ecosystem: it is responsible for providing the computational resources needed for application execution. YARN consists of two essential components: the Resource Manager and the Node Manager.

Resource Manager: it works at the cluster level and runs on the master machine. It keeps track of heartbeats from the Node Managers, accepts job submissions, and negotiates the first container for executing an application. It consists of two components: the Application Manager and the Scheduler.

Node Manager: it is a node-level component that runs on every slave machine. It is responsible for managing containers and monitoring resource utilization in each container, and it also keeps track of log management and node health. It maintains continuous communication with the Resource Manager to give updates.

MapReduce

MapReduce is a core component of the Hadoop ecosystem, as it provides the processing logic. Put simply, MapReduce is a software framework for writing applications that process large data sets using distributed, parallel algorithms in a Hadoop environment. The parallel-processing nature of MapReduce plays a crucial role in the ecosystem: it makes it possible to perform big data analysis using multiple machines in the same cluster.

How does MapReduce work? A MapReduce program has two functions: one is Map, and the other is Reduce.

Map function: it converts one set of data into another, where individual elements are broken down into tuples (key/value pairs).

Reduce function: it takes the output of the Map function as its input.
The Reduce function aggregates and summarizes the results produced by the Map function.

Apache Spark:

Apache Spark is an important product from the Apache Software Foundation and is considered a powerful data processing engine; Spark is powering big data applications around the world. It all started with the increasing needs of enterprises that MapReduce was unable to handle: the growth of large amounts of unstructured data, the increasing need for speed, and the demand for real-time analytics led to the invention of Apache Spark.

Spark features:
- It is a framework for real-time analytics in a distributed computing environment.
- It executes computations in memory, which increases the speed of data processing compared to MapReduce.
- It is claimed to be up to 100x faster than Hadoop MapReduce when processing data, thanks to its exceptional in-memory execution and other optimization features.
- Spark ships with high-level libraries and supports R, Python, Scala, Java, etc. These standard libraries make data processing seamless and highly reliable.

Spark can process enormous amounts of data with ease, while Hadoop was designed to store the unstructured data that must be processed; combining the two gives us the desired results.

Hive:

Apache Hive is open source data warehouse software built on Apache Hadoop for performing data query and analysis. Hive mainly performs three functions: data summarization, querying, and analysis. Hive uses a language called HiveQL (HQL), which is similar to SQL; HiveQL works as a translator, turning SQL-like queries into MapReduce jobs that are executed on Hadoop.

The main components of Hive are:

Metastore: it serves as the storage for metadata. This metadata holds information about each table, such as its location and schema, keeps track of the data, and can act as a backup store in case of data loss.

Driver: the driver receives HiveQL statements and acts as a controller.
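To make the Map and Reduce functions described in the MapReduce section concrete, here is the classic word-count example in plain Python, with the shuffle step (grouping intermediate values by key) done with a dictionary. No Hadoop is involved; the function names are illustrative, and in a real cluster each phase would run distributed across many machines.

```python
from collections import defaultdict

def map_fn(line):
    # Map: break each input record into (key, value) tuples.
    return [(word, 1) for word in line.split()]

def reduce_fn(word, counts):
    # Reduce: aggregate all values seen for one key.
    return word, sum(counts)

def map_reduce(lines):
    # Shuffle/sort phase: group intermediate values by key,
    # so each Reduce call sees every value for its key.
    groups = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())

result = map_reduce(["big data", "big cluster"])
print(result)   # {'big': 2, 'data': 1, 'cluster': 1}
```

The same three-phase shape (map, shuffle by key, reduce) is what Hadoop executes at scale; the framework handles the distribution, fault tolerance, and disk spilling that this sketch ignores.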
It observes the progress and life cycle of the various executions by creating sessions. Whenever HiveQL executes a statement, the driver stores the metadata generated by that action.

Compiler: the compiler is tasked with converting a HiveQL query into MapReduce input. It is designed to produce the steps and functions needed to turn the HiveQL statement into the output required by MapReduce.

HBase:

HBase is considered the Hadoop database: it is a scalable, distributed NoSQL database that runs on top of Hadoop. Apache HBase is designed to store structured data in tables that can have billions of rows and millions of columns, and it provides real-time read and write access to data on HDFS.

HBase features:
- HBase is an open source, NoSQL database.
- It is modeled after Google's Bigtable, a distributed storage system designed to handle big data sets.
- It has a unique ability to support all types of data, which gives it a crucial role in handling the various kinds of data in Hadoop.
- HBase is written in Java, and its applications can be written using the Avro, REST, and Thrift APIs.

Components of HBase: there are two major components in HBase, the HBase Master and the RegionServer.

a) HBase Master: it is not part of the actual data storage, but it manages load balancing across all RegionServers. It controls failovers, performs administrative activities (providing an interface for creating, updating, and deleting tables), handles DDL operations, and maintains and monitors the Hadoop cluster.

b) RegionServer: it is a worker node which handles read, write, and delete requests from clients. A RegionServer runs on every node of the Hadoop cluster, on top of the HDFS DataNodes.

HCatalog:

HCatalog is a table and storage management tool for Hadoop. It exposes the tabular metadata stored in the Hive metastore to all the other applications of Hadoop.
HCatalog lets all kinds of Hadoop components, such as Hive, Pig, and MapReduce, quickly read and write data from the cluster. It is a crucial feature of Hive which allows users to store their data in any format and structure; by default it supports the CSV, JSON, RCFile, ORC, and SequenceFile formats.

Benefits of HCatalog:
- It assists integration with the other Hadoop tools and lets them read data from, or write data into, a Hadoop cluster.
- It allows notifications of data availability.
- It enables APIs and web servers to access the metadata in the Hive metastore.
- It gives visibility to data archiving and data cleaning tools.

Apache Pig:

Apache Pig is a high-level language platform for analyzing and querying large data sets stored in HDFS. Pig works as an alternative to Java programming for MapReduce and generates MapReduce functions automatically. Pig comes with Pig Latin, a scripting language; Pig translates Pig Latin scripts into MapReduce jobs which can run on YARN and process data in the HDFS cluster. Pig is best suited for solving complex use cases that require multiple data operations. It is more of a processing language than a query language (compare Java or SQL), and it is highly customizable, because users can write their own functions in their preferred scripting language.

How does Pig work? We use the 'load' command to load the data into Pig. Then we can perform various functions on it, such as grouping, filtering, joining, and sorting. Finally, you can dump the data to the screen, or store the result back in HDFS, according to your requirement.

Apache Sqoop:

Sqoop works as a front-end loader for big data. It is a front-end interface for moving bulk data between Hadoop and relational databases or variously structured data marts, and it replaces the practice of developing custom scripts to import and export data.
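The load, transform, dump flow that Pig scripts follow (described above) can be illustrated in plain Python. The records below are made-up sample data; each step is annotated with the Pig Latin operator it mimics, but this is only a sketch of the dataflow, not Pig itself.

```python
# LOAD: in Pig this would be `records = LOAD 'sales' ...`;
# here we just use an in-memory list of hypothetical sales rows.
records = [
    {"region": "east", "amount": 120},
    {"region": "west", "amount": 80},
    {"region": "east", "amount": 200},
]

# FILTER ... BY: keep only rows with amount >= 100.
filtered = [r for r in records if r["amount"] >= 100]

# GROUP ... BY: bucket the surviving rows by region.
grouped = {}
for r in filtered:
    grouped.setdefault(r["region"], []).append(r["amount"])

# FOREACH ... GENERATE: aggregate each group.
totals = {region: sum(amounts) for region, amounts in grouped.items()}

# DUMP: print the result (a STORE would write it back to HDFS).
print(totals)   # {'east': 320}
```

In Pig, each of these steps is one line of Pig Latin, and the engine compiles the whole pipeline into MapReduce jobs automatically, which is exactly the convenience the section describes.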
It mainly helps in moving data from an enterprise database into a Hadoop cluster for the ETL process.

What Sqoop does: Apache Sqoop undertakes the following tasks to integrate bulk data movement between Hadoop and structured databases. Sqoop fulfills the growing need to transfer data from mainframes to HDFS. It helps achieve improved compression and light-weight indexing for better query performance. It transfers data in parallel for effective performance and optimal system utilization. Sqoop creates fast data copies from an external source into Hadoop. It also acts as a load balancer by offloading extra storage and processing load to other devices.

Oozie: Apache Oozie is a tool in which all sorts of programs can be pipelined in the required order to run in Hadoop's distributed environment. Oozie works as a scheduler system to run and manage Hadoop jobs. It allows multiple complex jobs to be combined and run in sequential order to achieve the desired output. It is strongly integrated with the Hadoop stack, supporting jobs such as Pig, Hive, and Sqoop, as well as system-specific jobs such as Java and shell. Oozie is an open-source Java web application. Oozie consists of two job types: 1. Oozie workflow: a collection of actions arranged to run one after another, like a relay race in which each runner starts right after the previous one finishes. 2. Oozie coordinator: runs workflow jobs based on data availability and predefined schedules.

Avro: Apache Avro is part of the Hadoop ecosystem and works as a data serialization system. It is an open-source project that helps Hadoop with data serialization and data exchange, enabling programs written in different languages to exchange data. It serializes data into files or messages.

Avro schema: the schema lets Avro serialize and deserialize data without code generation. Avro needs a schema for data to read and write.
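The role the schema plays can be illustrated with a minimal sketch in plain Python, using the stdlib json module to stand in for Avro's binary encoding (the schema and record here are hypothetical; the real Avro library stores the schema in a compact container-file header):

```python
import json

# A hypothetical Avro-style schema: field names and types travel with the data.
schema = {
    "type": "record", "name": "User",
    "fields": [{"name": "name", "type": "string"},
               {"name": "age", "type": "int"}],
}
record = {"name": "Ada", "age": 36}

# Serialize: write schema and data together, as an Avro container file does,
# so any later reader can interpret the bytes without generated code.
blob = json.dumps({"schema": schema, "data": record}).encode("utf-8")

# Deserialize: a reader recovers both schema and record from the blob alone.
payload = json.loads(blob.decode("utf-8"))
field_types = {f["name"]: f["type"] for f in payload["schema"]["fields"]}
print(field_types)      # -> {'name': 'string', 'age': 'int'}
print(payload["data"])  # -> {'name': 'Ada', 'age': 36}
```

Because the schema accompanies the data, a reader needs no generated classes to interpret it — which is exactly the "dynamic typing" property described next.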
Whenever we store data in a file, its schema is stored along with it, so the file can be processed later by any program. Dynamic typing: data can be serialized and deserialized without generating any code; code generation remains available as an optional optimization for statically typed languages.

Avro features: Avro provides a fast, compact binary data format, a container file for storing persistent data, and support for building efficient data structures.

Apache Drill: The primary purpose of the Hadoop ecosystem is to process large data sets, whether structured or unstructured. Apache Drill is a low-latency distributed query engine designed to scale to several thousand nodes and query petabytes of data. Drill can also evict cached data to release space.

Features of Drill: It offers an extensible architecture at all layers. Drill presents data in a hierarchical format that is easy to process and understand. Drill does not require centralized metadata, and users do not need to create and manage tables in a metadata store in order to query data.

Apache ZooKeeper: Apache ZooKeeper is an open-source project designed to coordinate multiple services in the Hadoop ecosystem. Organizing and maintaining a service in a distributed environment is a complicated task; ZooKeeper solves this problem with its simple APIs and architecture, allowing developers to focus on the core application instead of the distributed plumbing around it.

Features of ZooKeeper: ZooKeeper is fast with workloads where reads of data are more common than writes. It is disciplined in that it maintains a record of all transactions.

Apache Flume: Flume collects, aggregates, and moves large sets of data from their origin into HDFS. It works as a fault-tolerant mechanism and helps in transmitting data from a source into a Hadoop environment.
Flume enables its users to get data from multiple servers into Hadoop immediately.

Apache Ambari: Ambari is an open-source project of the Apache Software Foundation that makes Hadoop manageable. It consists of software capable of provisioning, managing, and monitoring Apache Hadoop clusters. Let's discuss each concept.

Hadoop cluster provisioning: It guides us with a step-by-step procedure for installing Hadoop services across many hosts, and it handles the configuration of Hadoop services across the cluster.

Hadoop cluster management: It acts as a central management system for starting, stopping, and reconfiguring Hadoop services across the cluster.

Hadoop cluster monitoring: Ambari provides a dashboard for monitoring the health and status of the cluster. The Ambari framework also acts as an alerting system: if, for example, a node goes down or runs low on disk space, it notifies us.

Conclusion: We have discussed all the components of the Hadoop ecosystem in detail; each element contributes its share of work to the smooth functioning of Hadoop. Every component of Hadoop is unique in its way and performs its own function when its turn arrives. To become an expert in Hadoop, you must learn all the components of Hadoop and practice them well. Hope you gained some detailed information about the Hadoop ecosystem. Happy learning!
Introduction: In this topic, let us try to understand the details of what database normalization is and, at the same time, the concepts of T-SQL in Microsoft SQL Server. The first concept is relevant to almost all relational databases available in the market today, while the second is specific to Microsoft's SQL Server database. Let us go through these two concepts in detail and understand them to the core.

Table of Contents: What is Normalization? What are the types of Normalization? What is Transact-SQL? T-SQL features What is T-SQL used for? What is T-SQL Vs SQL? Advantages of Normalization and T-SQL Conclusion

What is Normalization? In a generic sense, normalization means bringing something to a normal condition or state, but the term has a very specific meaning in the area of databases. Database normalization is the process of reorganizing data in a relational database in accordance with a series of so-called normal forms, in order to reduce the amount of redundant data and, at the same time, improve data integrity. Two principles govern the process of database normalization: there should be no redundant data (each piece of data is stored in only one place), and data dependencies should be logical (all related data should be stored together).

Database normalization was proposed by Edgar F. Codd and is an integral part of his relational model; he is also considered the father of the relational data model, and almost all relational database engines that exist today still follow the rules he laid down. He defined the first three normal forms (abbreviated NF): 1NF, 2NF, and 3NF.
He proposed the theory of normalization with the introduction of the First Normal Form and then extended it with the Second and Third Normal Forms. The theory was later extended with the help of Raymond F. Boyce to form BCNF. We will discuss these in further detail in the following sections.

What are the types of Normalization? With the understanding gained through the sections covered so far, let us go through each of these normal forms and see what each one brings to the table when applied.

1. First Normal Form (1NF): The First Normal Form is achieved when each table cell contains only a single value and each record is unique. The first condition is clear enough: there cannot be more than one value in a given column of a database table row. The second condition is achieved through a primary key (a primary key is a column of a table that uniquely identifies each database record). A point to note here is that a primary key can be composed of more than one column: a composite key is a set of columns used in conjunction with each other to uniquely identify a database record. The following definitions will help in understanding the concepts of a table, cell, row, and column.
Please take a look at these if you feel it is required: A table is a set of data elements organized using rows and columns. A table cell is the value at the intersection of a specific row and column. A table record (row) is a specific set of values for the columns that define the table. A column is a specific element of a table; a combination of such columns constitutes the table.

2. Second Normal Form (2NF): A table can be taken to the Second Normal Form only if it already complies with all the rules of 1NF. The additional rule is that every non-key column must depend on the whole primary key; partial dependencies on only part of a composite primary key are not allowed. To remove such dependencies, a table with a composite primary key is disintegrated into two different tables, and the relationship is expressed through the newly created tables. This is where the concept of a foreign key comes in: a foreign key is a column that references the primary key of another table to connect the two tables.

3. Third Normal Form (3NF): A table can be taken to the Third Normal Form only if it already complies with all the rules of 2NF. The additional rule is that there should be no transitive functional dependencies. A transitive functional dependency is the scenario in which changing a non-key column may force a change in another non-key column's value. Ideally, a table need not go further than this level in the process of normalization, but for the sake of completeness, we will go through the remaining forms as well.

4. Boyce-Codd Normal Form (BCNF): Most scenarios will not reach this level, but anomalies can still result if there is more than one candidate key. BCNF is sometimes called 3.5NF. 5.
Fourth Normal Form (4NF): A table is in the Fourth Normal Form if no table instance contains two or more independent, multivalued facts describing the relevant entity.

6. Fifth Normal Form (5NF): A table can be in the Fifth Normal Form only if it complies with all the rules of 4NF. In addition, it must no longer be possible to decompose the table into further tables without losing data; only then is it in 5NF.

7. Domain/Key Normal Form (DKNF): The domain/key normal form is a normal form used in the process of database normalization which requires that the database contain no constraints other than domain constraints and key constraints. A domain constraint specifies the permissible values for a given attribute, and a key constraint specifies the attributes that uniquely identify a row in a table. A table is DKNF-compliant when every constraint on the relation is a logical consequence of the definitions of its keys and domains. Ensuring both kinds of constraints are met means there are no non-temporal anomalies in the database.

8. Sixth Normal Form (6NF): Generally speaking, 6NF is not a standardized form of a database table; there are still discussions among database experts about which rules a table should meet to be called 6NF.

What is Transact-SQL? T-SQL, short for Transact-SQL, is a proprietary extension to SQL developed by Sybase and Microsoft. T-SQL expands on the SQL standard by including more features than the standard provides. T-SQL is central to SQL Server: most operations performed in SQL Server are done via T-SQL, and this is just as true when working through GUI tools such as SSMS or DBeaver.
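Before diving deeper into T-SQL, the decomposition described under 2NF above can be made concrete with a short sketch using Python's stdlib sqlite3 module (all table and column names here are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")

# In a denormalized design, the customer's city would be repeated on every
# order row. Normalizing splits this into two tables linked by a foreign key.
con.execute("""CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name TEXT, city TEXT)""")
con.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    amount REAL)""")

con.execute("INSERT INTO customers VALUES (1, 'Acme', 'Austin')")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                [(10, 1, 99.0), (11, 1, 45.5)])

# The city is stored exactly once; a JOIN reassembles the original wide view.
rows = con.execute("""SELECT o.order_id, c.name, c.city
                      FROM orders o JOIN customers c
                      ON o.customer_id = c.customer_id""").fetchall()
print(rows)   # -> [(10, 'Acme', 'Austin'), (11, 'Acme', 'Austin')]
```

Updating the city now touches one row in customers instead of every order row, which is the redundancy and consistency benefit normalization aims for.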
From now on, remember that for any operation you perform on Microsoft SQL Server through any GUI tool, it is T-SQL that runs in the background. T-SQL is proprietary to Microsoft SQL Server and the Microsoft Azure SQL database. Transact-SQL (T-SQL) is not the only extension to the SQL standard: T-SQL is the extension developed by Sybase and Microsoft, PL/SQL is the extension owned by Oracle, and PL/pgSQL is the extension maintained by the PostgreSQL project. Although such extensions bring various advantages, they make moving from one database system to another considerably more difficult.

T-SQL features: With this understanding of T-SQL, let us look at the features it provides that standard SQL does not: Procedural programming. Local variables, for more flexible control over application flow. A range of support functions for processing strings and dates, as well as logical and mathematical functions; these help make T-SQL Turing complete (a property that indicates the universality of a computing language). An extended implementation of DELETE and UPDATE that allows a FROM clause to be added, which in turn allows JOINs to be used. BULK INSERT, to import large amounts of data into a table or view in a specified user format. T-SQL also gives you a greater grip on programmability; an example is stored procedures, where altering the input parameters changes the output as well.

What is T-SQL used for? As discussed in the earlier sections, T-SQL is the proprietary flavor of the SQL standard for Microsoft SQL Server, in the same way that PL/SQL is for the Oracle database.
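The extended UPDATE mentioned in the feature list above can be contrasted with the standard-SQL equivalent. The sketch below uses Python's stdlib sqlite3 and a standard correlated subquery; the commented-out statement shows the T-SQL UPDATE ... FROM form with a JOIN (table and column names are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE prices (item TEXT PRIMARY KEY, price REAL)")
con.execute("CREATE TABLE updates (item TEXT PRIMARY KEY, new_price REAL)")
con.execute("INSERT INTO prices VALUES ('pen', 1.0), ('book', 5.0)")
con.execute("INSERT INTO updates VALUES ('book', 6.5)")

# T-SQL form (SQL Server only), using the FROM clause and a JOIN:
#   UPDATE p SET p.price = u.new_price
#   FROM prices p JOIN updates u ON p.item = u.item;
# Standard-SQL form, runnable here, using a correlated subquery:
con.execute("""UPDATE prices
               SET price = (SELECT new_price FROM updates
                            WHERE updates.item = prices.item)
               WHERE item IN (SELECT item FROM updates)""")

print(con.execute("SELECT item, price FROM prices ORDER BY item").fetchall())
# -> [('book', 6.5), ('pen', 1.0)]
```

Both forms update only the rows that have a match; the T-SQL FROM/JOIN syntax simply expresses the same intent more directly.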
An individual can create the required T-SQL units, such as SQL scripts, triggers, functions, or stored procedures, to cater to specific requirements. For example, a trigger might be written to handle auditing on a specific table. Such requirements can be handled on the SQL Server instance itself, and because SQL Server has its own notion of syntax in the form of T-SQL, the database artifacts thus created are called T-SQL code rather than plain SQL code. By calling it T-SQL, you draw a definitive line: the code works with SQL Server and may or may not work with other database vendors, since syntaxes and available functionality can differ.

What is T-SQL vs SQL? Let us go through the differences between SQL and T-SQL point by point:

SQL (Structured Query Language): a language to operate over sets of data. T-SQL: a proprietary procedural language designed specifically for Microsoft SQL Server; it can be described as an extension to SQL.
SQL: an open standard followed by various database providers (e.g. Oracle, Sybase, PostgreSQL, MySQL, MS SQL Server). T-SQL: proprietary to Microsoft SQL Server.
SQL: not Turing complete, and limited in the scope of what can be done with it. T-SQL: Turing complete.
SQL: has no procedural programming of its own. T-SQL: contains procedural programming and local variables.
SQL: uses the standard forms of DELETE and UPDATE. T-SQL: extends DELETE and UPDATE, for example with a FROM clause.

Advantages of Normalization and T-SQL: There are numerous advantages to discuss for each.
Let us take a closer look at the advantages that normalization provides: Normalizing a database to the appropriate level gives us better overall database organization. It ensures that there is no redundant data and that data is consistent within the database. It provides a better grip on database security as well. Functional dependencies are important and are handled during the normalization process itself. It provides ways and means for flexible database design. Normalization also yields smaller tables with smaller rows, meaning more rows per page and less logical I/O. Index searching gets quicker, as indexes tend to be narrower and shorter.

Let us now take a closer look at the advantages that T-SQL provides: T-SQL gives us ways and means to interact with SQL Server effectively through SQL queries. Stored procedures allow multiple operations to run under a single transaction context. Triggers help handle audit-related work in a much more efficient manner. More and more database-related artifacts can be handled at the database level itself, rather than relying on other programming languages to gain these features for SQL Server. The main advantage of T-SQL is that it provides better control over the database instance from the database level itself.

Conclusion: In this article, we have tried to understand the concepts of normalization and T-SQL (as they pertain to MS SQL Server). We have gone through the various forms of normalization in an RDBMS, focused on the features that T-SQL has to offer, and covered the advantages it brings with its usage. On the whole, these two concepts, put to use in an efficient manner, will offer greater performance for your application.
We have taken the utmost care to provide you with the best of the details, but we would still encourage you to go through the official documentation for the latest updates, as things change at a quick pace. Hope you got all the details at once and that they were of good use as well.
Introduction: Today, businesses in almost every industry are under tremendous pressure to mine their own data and look for trends and patterns; based on these, they are even expected to predict the future for their own organizations. The tools that achieve this are classified as Business Intelligence (BI) tools: they pull in raw data from various sources and have the capability to turn it into actionable insights. These tools come with advanced reporting and visualizations that support further analysis in order to make better decisions. Choosing the ideal Business Intelligence (BI) software for your organization plays a key role in increasing your organization's efficiency. In that quest, this article discusses two giant names in the Business Intelligence (BI) industry: Tableau Desktop and Tibco Spotfire. We will analyze them from various perspectives to understand what each is best used for, looking at the features each tool brings to the table and, at the same time, how each behaves when actually put to the test. Now, let's have a look at the topics we are going to explore in this article.

Table of Contents: What is Tableau Desktop? What is Tibco Spotfire? Overview of Tableau Desktop and Tibco Spotfire Conclusion

What is Tableau Desktop? Tableau is a software company headquartered in Seattle, Washington, United States, dedicated to producing interactive data visualization products focused on Business Intelligence (BI). Tableau Desktop is a leading new-generation Business Intelligence (BI) application, often described as a 'self-service' data discovery tool that can be used to achieve your goals without any IT support.
This is the tool that caters to all your graphical and visualization needs, where other tools might take far longer to produce even basic pie charts or bar graphs. NOTE: For more details about the product and/or the company, refer to the official site.

What is Tibco Spotfire? Spotfire comes from a Business Intelligence (BI) company based in Somerville, Massachusetts, United States. Spotfire has been available since the 1990s but didn't really take off until 2007, when it was acquired by the big brand, Tibco. An exact customer count is unavailable at the moment, but it is safe to say that Tibco has a larger product portfolio and also a larger market share. Spotfire, an offering in the Tibco suite of products, is a smart, flexible tool that provides data visualization and predictive analysis of business data. It also works in conjunction with many other tools to simplify your analytics journey, providing simple dashboards that nevertheless yield deep insights. NOTE: For more details about the product and/or the company, refer to the official site.

Overview of Tableau Desktop and Tibco Spotfire: Now that we have a better understanding of each product and the company that owns it, we will take a deeper look at both to understand their strengths and weaknesses, covering as many points of comparison as possible. Based on your needs, you can refer to the section that interests you most. Although a lot of effort has gone into presenting the best details available, we would still encourage readers to go through the official documentation before making a purchase decision; this ensures you are not acting on outdated information if a newer version of either product is available. Let us now take a quick look at each of the factors on which we will compare the products: 1.
Tableau vs Spotfire: Pricing Comparison. Price is definitely a factor that any organization will be keen to look at, so we take it up as the very first point of comparison. Although the two products approach pricing and packaging differently, we will give details as accurately as possible.

Tableau Desktop: The price scheme is categorized in three ways: 1. One-time payment 2. Annual subscription 3. Quote-based. Tableau Desktop (Personal Edition) is quoted at $35 per user per month. Tableau Desktop (Professional Edition) is quoted at $75 per user per month.

Tibco Spotfire: Tibco's pricing is primarily quote-based. Published reference prices include $0.99 per hour for cloud usage, Tibco Spotfire Desktop at $650 per year, Tibco Spotfire Cloud at $200 per month or $2,000 per year, and the Tibco Spotfire Platform, which comes in subscription, perpetual, and term licenses based on an organization's needs and requirements.

2. Feature Comparison Between Tableau and Spotfire: Let us look at the features that ship with each product; based on these, you can choose the product that best suits your requirements.
Tableau Desktop: Toggle view and drag-and-drop features. A list of native data connectors. Highlighting and filtering of data. Dashboard sharing, embedded dashboards, dashboard comments, mobile-ready and interactive dashboards. Data notifications. Tableau Reader for data viewing. Query creation without any code, and translation of queries into visualizations. Metadata management. Automatic updates. A server REST API. Tableau Public for data sharing.

Tibco Spotfire: Big data analytics. Content analytics. Predictive analytics. Location analytics. Event analytics. Data discovery and visualization. Dashboards and analytic applications. Advanced collaboration tools.

3. Tableau vs Spotfire: Comparing the Products. Although Tableau Desktop and Tibco Spotfire cater to the same organizational needs, each tool handles them differently, and a product comparison gives a better idea of what can be expected from each. Let us take a closer look at the features of the products and compare them against each other for a clearer picture:

Tableau Desktop provides a large set of visualizations. Tibco Spotfire offers fewer visualizations in comparison.
Both Tableau Desktop and Tibco Spotfire are designed to work on Windows and Macintosh OS environments.
Tableau Desktop provides a wide range of graphical features such as charts, bars, and graphs for pictorial depiction of the available data. Tibco Spotfire provides a competitive number of graphical features too.
Tableau Desktop offers strong data analytics features that help organizations make business-friendly decisions after careful analysis of historical data. Tibco Spotfire offers a visual and interactive way for experts to make decisions based on the underlying data.
Tableau Desktop's main features are toggle view, drag-and-drop, a wholesome list of native data source connectors, and conditional highlighting and filtering of data. Tibco Spotfire enables big data analytics, content analytics, predictive analytics, location analytics, event analytics, data discovery, and other features like these.
Tableau Desktop provides built-in integration with R, so R models can be run once the integration is complete. Tibco Spotfire is one of the best tools for R support and integration, with many further integrations across the Tibco family of products.
Tableau Desktop is best suited to medium to large enterprises with intensive data needs. Tibco Spotfire is best suited to small to medium enterprises.
Tableau Desktop has supported big data analytics since Tableau 8.1. Tibco Spotfire is considered one of the best options if big data analytics is a priority.
Tableau Desktop added a storytelling feature with a new UI starting with Tableau 8.1. Tibco Spotfire has no automatic storytelling feature; this is manual.
Tableau Desktop provides excellent support for PowerPivot. Tibco Spotfire provides no such PowerPivot support.
Tableau Desktop provides excellent data drill-down options to the deepest level possible. Tibco Spotfire's drill-down is competitive in comparison.
Tableau Desktop evolved into a mature product only from version 8.2. Tibco Spotfire is considered a mature product in the same space, including in comparison with Tableau Desktop.
4.
Data Analytics: How does Tableau Compare to Spotfire? Both products can connect to external data sources, but Tableau can connect to more data sources than Tibco Spotfire. Spotfire, for its part, can integrate data sets of different formats while connecting to an external source, a capability Tableau offers as well. Let us now look at the offerings from the product standpoint:

Tableau Desktop: Users can manipulate and visualize larger data sets. It can connect to over 40 data sources, ranging from Microsoft Excel to Hadoop clusters, putting customers in a comfortable position to create the reports they require. It can visualize trends by picking up on repeated patterns in the data sets.

Tibco Spotfire: It can connect to over 20 data sources and, at the same time, work with various data formats. It has built-in capabilities for statistical analysis and modelling right from the dashboard. It can run MATLAB, SAS, R, or S+ functions from the UI so users can base predictions on their own calculations.

5. Difference Between the Dashboards of Spotfire and Tableau: Dashboards are the backbone of any Business Intelligence (BI) tool. They give users the ability to organize data sources, reports, and related artifacts in a centralized location so that everything stays updated in real time.

6. Tibco Spotfire vs Tableau: Customer Support. Customer support is the key consideration after a customer makes the purchase decision, from installation to deployment and any issues requiring attention post-deployment. The support team works with all the needed stakeholders to keep the organization's image positive.
In that spirit, let us now look at how each product fares in this category:

Tableau Desktop: Provides four tiers of customer support: Complimentary, Technical Support, the Elite program, and the OEM program. Non-critical and non-technical issues, such as software bugs and configuration-related questions, come under Complimentary Support. Technical Support covers critical issues and comes with a one-year license subscription. The Elite program includes a dedicated support manager who prioritizes support requests. The OEM program assigns a Partner Support Engineer to partners who integrate Tableau into their own software suites. The majority of customers speak well of Tableau products; in fact, Gartner's Magic Quadrant has recognized Tableau again this year, for the sixth time. Tableau's customer support has kept pace with its growth and, as a result, customers consistently give it a 100% satisfaction rating.

Tibco Spotfire: Spotfire users can visit the Tibco Support Central portal to submit their requests. A good number of knowledge articles are available covering specific scenarios. There is provision to connect with other Spotfire users through the Spotfire community and forums. Tibco also provides discounted training offerings in the form of educational passports.

7. Tibco Spotfire vs Tableau: Power Users. Some of the power users of Tableau are SpaceX, Deloitte, Coca-Cola, Dell, Citrix, and Pandora. Some of the power users of Spotfire are Procter and Gamble, Cisco, NetApp, and Shell.

Conclusion: In this article, we have gone through two software products that cater to similar needs in their own ways, namely Tableau Desktop and Tibco Spotfire. We have also gone through an extensive list of features and comparisons between them.
Both are robust data analytics and visualization tools providing similar features to their end users, but the choice between the two ultimately rests with the end user. Tableau excels at visualization, whereas Spotfire, with its built-in capabilities, excels at statistical data analysis. We hope we were able to present all the details one would need to compare these two tools. Please do let us know if you have any suggestions for our articles.
In the software industry, it is a well - known fact that an operating system (OS) is the most important component of a computer. It is the primary software that manages all the software and hardware on a computer. There are different types of operating systems and Linux is one among them. In this Linux tutorial, we will start from the basics of linux and learn all the major linux concepts that a linux professional must be aware of. Now, let’s have a look at the components of this tutorial. What is Linux? Linux Vs. Windows What are the features of Linux? What are the advantages of Linux? What is the difference between Unix and Linux? How can I learn Linux? What are Linux commands and how you can use them? What is Linux Shell? Navigation Process Management Filesystem File Manipulation Permissions Linux Career Path - Jobs, Roles & Responsibilities, and Salary Packages What is Linux? The term “open source” originated in the software development context to designate a particular approach to the creation of computer programs. An open source OS basically refers to a type of computer software where its copyright holder grants users all over the world, the rights to distribute (to anyone for any purpose), change, and study the software. Linux is a community-developed and an open source operating system for servers, computers, mainframes, embedded devices, and mobile devices. Almost all the main computing platforms including SPARC, ARM, and x86 support linux, and this makes it one of the most widely supported operating systems. A linux distribution, also known as Linux distro, is a version of open source linux operating system and is packaged with various other components like management tools, installation programs, and additional software like KVM hypervisor. RHEL (Red Hat Enterprise Linux) from RedHat is one of the most popular linux distributions. RHEL is developed specifically for the business market. Linux Vs. 
Windows In this section, we will compare Linux with another major operating system, Windows. Windows is a group of several OS families, and each of its versions has a GUI (graphical user interface) with a desktop that enables users to view folders and files. The comparison below contrasts Windows and Linux point by point. Ease of use - Windows OS is very easy to use; simplicity and user-friendliness are two of its major design characteristics. An average user must gain some knowledge to use Linux OS, and performing day-to-day operations requires an in-depth understanding of the underlying system. Reliability - Windows is less reliable than Linux. Linux is highly secure and reliable, focusing on uptime, system security, and process management. Software - The majority of Windows games, utilities, and programs are commercial; the majority of their Linux counterparts are open source and free. Support - Windows provides online and integrated help systems, and many books are available for all skill levels. Linux has massive online support through a large community of websites and user forums. Users - Windows OS is usually used by novice users, gamers, and business users who depend on Microsoft software. Academic, scientific, and corporate organizations of every size use Linux; it powers servers and development machines at NASA, Twitter, Facebook, Google, and various other top organizations. Installation - Windows installation is very easy but takes time. Linux installation involves some complications, but the OS can complete complex tasks faster. What are the features of Linux? Over the years, Linux has gained a reputation as a very efficient and fast-performing system. Its features show how effective this operating system is. Now, let's explore the major features of Linux OS. Portability - This means software can work on various kinds of hardware in the same manner. 
Here, "port" means to alter software so that it can function on a different system; Linux can run on almost any hardware environment. Free Software - Linux can be downloaded free from the internet: free updates, no per-user costs, no registration fees, and freely available source code if you want to change your system's behavior. Versatile and Secure - The security model used in Linux is based on the UNIX idea of security, which is of proven quality. Linux is also versatile: many tasks can be executed at night or automatically scheduled for other quiet moments, resulting in more availability during busier periods and more balanced utilization of the hardware. Multi-User System - Linux is a multi-user system, which means system resources such as application programs and memory (RAM) can be accessed by multiple users at the same time. Hierarchical File System - Linux provides a standard file structure in which user and system files are arranged. Multiprogramming - Linux supports multiprogramming, meaning multiple applications can run at the same time. [Related Article Page: Introduction To Linux Operating System] What are the advantages of Linux? The best assets of Linux, compared with other operating systems, are its reliability, its price, and the freedom it gives you. Now let's look at the major advantages of Linux OS. Free This is one of the major advantages of Linux: you can download most Linux distributions freely from the web, install them legally on any number of computers, and give them freely to other people. Security Most viruses that attack an operating system are developed via the ActiveX software framework, which Linux does not have; the same principle applies to various other kinds of malware such as worms, Trojans, and spyware. 
Stability Linux systems are very stable and do not freeze up the way some other systems do. Open Source You can add new features, remove things you don't like, and customize the system. You can do this because the source code is publicly accessible, so anyone can change and customize the software according to their requirements. Support for Programming Languages Almost all programming languages (Ruby, Perl, Java, C/C++, Python, etc.) are supported by Linux, and it offers many applications useful for programming. [Related Article Page: What Can I Do Using Linux?] What is the difference between Unix and Linux? Linux is a more famous OS than Unix in today's world, but the latter has its own users. Unix is a family of multiuser, multitasking operating systems analogous to Windows and DOS, used in workstations, internet servers, and PCs, with variants from major vendors such as HP and Sun (whose Unix is Solaris). In this section, we will look at the key differences between Unix and Linux. Unix is a proprietary OS developed at Bell Labs. It works primarily through a CLI (Command Line Interface). Unix is not as flexible as Linux and has less compatibility with different hardware types: its installation requires well-defined, strict hardware machinery, and it works only on particular CPU machines. Installing Unix also involves a higher cost than Linux, since the former requires special hardware and runs only on particular CPU processors. Unix is not portable, and its distributions or versions are fewer in number than Linux's. The source code of Unix is not freely available, and it supports fewer file systems than Linux does. Linux is based on Unix and is basically a kernel, shipped with a GUI (Graphical User Interface) like Windows OS; it also has a CLI, whose use is optional. Unlike Unix, Linux can be downloaded and distributed freely. 
Also, there are priced Linux distributions, such as Red Hat Linux, and they are generally cheaper than Windows. Linux is compatible with almost all hardware systems and is quite flexible; you can install and run Linux on almost anything that has a processor. The source code of Linux is freely available, as it is a free OS. Compared with Unix, Linux installation is highly economical. Linux is highly scalable and supports a huge set of file systems. Compared with Unix, there are far more Linux versions or distributions. [Related Article Page: Top Most Reasons To Use Linux] How can I learn Linux? The first thing to do when learning Linux is to prioritize your needs. As Linux comes in many distributions, you have to choose a distribution that suits your needs; to do that, you should learn about the distributions that are available. Then learn about the desktop environments you can use with your Linux distribution, and familiarize yourself with the applications that come with your operating system. You must learn how to use the terminal, as it is present in all Linux operating systems; you can download applications and files through it. Look for the Software Center application that comes with your operating system: it will help you locate new applications to install, and applications that are not present in the Software Center can be installed from the terminal. Finally, familiarize yourself with the file system and learn where the common directories are. This is how you can learn Linux. What are Linux commands and how can you use them? An instruction that a user gives the computer to perform a specific task is called a command. In this section, I will introduce you to basic yet highly important Linux CLI commands. 
These commands will give you a working knowledge of how to get around your Linux terminal from the shell. rm - Removes files from your Linux OS. locate - Locates a file within the Linux OS. touch - Creates files from the Linux CLI. rmdir - Removes existing directories from the Linux CLI. mkdir - Creates a new directory. man - Displays the manual page of the given command. mv - Moves a file to another directory or folder. cd - Changes between directories. ls - Lists the files and directories under a given directory. What is Linux Shell? A shell is a special program in an operating system that takes commands from the keyboard and passes them to the OS to execute. It was once the only UI (User Interface) available on Linux; today, in addition to CLIs like the shell, we have GUIs as well. A program known as bash serves as the shell on most Linux systems; other shell programs that can be installed on a Linux system are zsh, tcsh, and ksh. A program known as a terminal emulator opens a window and enables you to interact with the shell. Navigation To use a Linux system effectively, you must be able to navigate the file system and know what is around you. For this, you need access to a Linux server, a basic understanding of how the terminal works and what Linux commands look like, and a configured regular (non-administrative) user account. With the pwd command, you can find where your home directory is in relation to the rest of the file system. You can see a directory's contents with the ls command, and the cd command changes the current directory when given a path. So, for exploring the Linux file system, you can use these three commands. 
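A short terminal session makes pwd, ls, and cd concrete. The sketch below assumes a standard Linux shell such as bash; it works entirely inside a throwaway directory created with mktemp, so it is safe to run anywhere.

```shell
# Create a scratch directory so the demo does not touch your real files.
workdir=$(mktemp -d)
cd "$workdir"
pwd                  # pwd: print the absolute path of the current directory
mkdir docs           # make a subdirectory to move into
touch notes.txt      # create an empty file so ls has something to show
ls                   # ls: list the contents of the current directory
cd docs              # cd: descend into the docs subdirectory
pwd                  # the printed path now ends in /docs
cd ..                # move back up one level
cd /                 # leave the scratch area before deleting it
rm -r "$workdir"     # clean up
```

Running each command and reading its output is the quickest way to build an intuition for where you are in the file system tree.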
Process Management Whenever a program is launched, by Linux or by the user, Linux creates a process. A process is a container of information about what that program is doing and how it is running. If a Linux process runs and terminates cleanly, everything is fine; however, if a process refuses to terminate when its time is up, or if it hogs the CPU, a few Linux commands can help restore normal operation. While managing Linux processes, you should observe which processes are running and how much of the system's resources each one is using, locate a specific process to see what it is doing and whether any action is needed, define or alter the priority level associated with a process, and, if a process is misbehaving, simply terminate it. Below are a few commands that can be used to manage Linux processes. They are entered through the CLI; to access it, simply open a terminal window. top - Provides information on the currently running processes. htop - Like top, but smarter and prettier; it presents the information in a clearer format. ps - Lists running processes. pstree - Shows a tree diagram of Linux processes and the relationships between them. who - Lists the users who are currently logged into the Linux system. kill - Terminates a process. Filesystem All files and directories in Linux are located in a tree-like structure. The file system root is the topmost directory; it is the directory from which all other directories are accessed, arranged in a hierarchical structure. The key aspects of a Linux file system are paths, drives, partitions, directories, mounting and unmounting, file extensions, case sensitivity, file system permissions, and hidden files. 
The file systems with which a standard Linux distribution offers the choice of partitioning a disk are as follows. ext2, ext3, and ext4 - These three file systems are progressive versions of ext (the Extended Filesystem), which was developed to overcome the limitations of the MINIX file system. ext2 was an improved version, ext3 added performance improvements, and ext4 brought further performance gains along with additional features. JFS (Journaled File System) - A file system developed by IBM for its AIX OS; JFS keeps track of changes to folders and files in a log. ReiserFS - An alternative to ext3 with advanced features and improved performance. XFS - A high-speed journaled file system aimed at parallel I/O processing. Btrfs (B-Tree File System) - Focuses on large storage configurations, repair, easy administration, and fault tolerance. File Manipulation While working on Linux OS, it is important to build a directory structure that lets us organize our data in a manageable way, without wasting a lot of time searching for a particular file. In this section, I will show you a few commands that are useful for creating and manipulating files. mkdir - Creates a directory. rmdir - Removes a directory. touch - Creates a blank file. cp - Copies a directory or a file. mv - Moves a directory or a file. rm - Deletes a file. Permissions Though Linux OS has a lot of security features, potential vulnerabilities can exist when local access is granted: file-permission issues arise when a user does not assign the correct permissions to directories and files. So permissions need to be assigned correctly, and below are the ways to assign them. There are three user-based permission groups for each directory and file. They are given below. 
Group permissions - These apply only to the user group assigned to the directory or file; other users are not affected. Owner permissions - These apply only to the owner of the directory or file; other users' actions are not affected. All-users permissions - These apply to every user on the system. There are three types of permission for each directory or file. They are as follows. Write permission - The ability of a user to write to or modify a directory or file. Read permission - The ability of a user to read the contents of a file. Execute permission - The ability of a user to view the contents of a directory or execute a file. The permissions can be viewed by checking the directory or file permissions in your GUI file manager. [Related Article Page: Linux Advanced Functions And Commands] Linux Career Path - Jobs, Roles & Responsibilities, and Salary Packages Linux is one of the growing IT technologies and an excellent opportunity for people looking for jobs on this platform. Its technological landscape suggests that it has a lot to offer in the coming years as well. Almost every reputed organization is looking for Linux-certified engineers, as it has become quite difficult to find knowledgeable and experienced Linux candidates. Job titles that depend heavily on Linux skills include Python developer, DevOps engineer, system administrator, system engineer, Java developer, Linux administrator, C++ developer, etc. Various Linux job roles are Linux Administrator, Linux Engineer, Linux Systems Administrator, Senior Software Engineer, etc. The roles and responsibilities of a Linux Administrator include maintaining, configuring, and installing Linux workstations and servers. 
They are responsible for maintaining the health of the servers and the network environment. While complying with the company's security standards, Linux administrators must provide solutions and support and resolve user requests. They are also responsible for evaluating software and hardware technologies and must keep their knowledge of the Linux system up to date. With so many Linux jobs out there, the next question that arises is the salary offered for these jobs. A candidate hired as a software developer or a system administrator without holding any certification can expect to draw around $100,000 a year, and people holding a CompTIA Linux certification can earn about 8 percent more than others in the same field. Thus, this Linux tutorial should give you an understanding of what Linux is all about and help you gain in-depth knowledge of all the Linux concepts.
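To close out the process-management commands covered earlier in this tutorial, here is a small, self-contained session you can try. It assumes a standard Linux shell and uses a harmless background sleep process as the target, so nothing important is touched.

```shell
# Launch a harmless long-running process to practice on.
sleep 300 &
pid=$!                        # shell variable holding the new process ID
ps -p "$pid" -o pid,comm      # ps: confirm the process is running
kill "$pid"                   # kill: send SIGTERM to end the process
wait "$pid" 2>/dev/null || true   # reap it; exit status is non-zero because it was killed
ps -p "$pid" > /dev/null || echo "process has terminated"
```

The same pattern scales up: use ps (or top/htop) to find a real runaway process's PID, then kill it in exactly this way.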
In almost every company, potentially usable data sits inaccessible; one study revealed that two-thirds of businesses derive little or no benefit from their data, which remains locked in legacy systems, isolated silos, or scarcely used applications. A few people implement ETL by programming in Java or SQL, but there are tools, such as Talend, that make this process simpler. Let's discuss what the ETL approach actually is and what impact it has on Talend. Table of Contents What is ETL? How ETL Works? ETL in Cloud Talend Data Integration Talend Open Studio (An ETL tool from Talend) Advantages of ETL tools Various categories of ETL Tools Future Scope of Talend ETL tool What is ETL? ETL is the abbreviation of Extract, Transform, and Load. It extracts data from different sources and converts it into an understandable format, which is then stored in a database and used for future reference. Extract is the process of reading data from a particular database, collected from multiple sources. There are many storage systems in which the data can reside, among them XML files, flat files, and relational database management systems (RDBMS). Transform converts the extracted data from its initial format to the required format. The methods used for transforming data include filtering, sorting, conversion, removing duplicates, and translating. Load is the final step of the ETL process, which writes the data into the target database. How ETL Works? Data from multiple sources is extracted and copied to the data warehouse. When handling huge volumes of data from many source systems, the data is combined into a single data store. ETL is also used to transfer data from an existing database to another database, and it is the primary process for loading data to and from data warehouses and data marts. 
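The extract, transform, and load steps described above can be sketched with nothing more than standard Unix text tools. The file names and columns below are invented for illustration; a real ETL tool such as Talend does the same thing at scale, with far richer transformations.

```shell
# Minimal extract-transform-load pass using only standard Unix tools.
# Source file, column names, and values are made up for this demo.
cat > source.csv <<'EOF'
id,name,amount
2,bob,300
1,alice,150
2,bob,300
EOF

# Extract: read the raw records, dropping the header row.
tail -n +2 source.csv > extracted.csv

# Transform: sort by id and remove duplicate rows.
sort -t, -k1,1n extracted.csv | uniq > transformed.csv

# Load: write the cleaned rows into the "target" store.
printf 'id,name,amount\n' > target.csv
cat transformed.csv >> target.csv
cat target.csv
```

Each stage reads the previous stage's output, mirroring the pipeline structure that ETL tools draw graphically.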
Representation of ETL Workflow ETL in Cloud One of the big trends over the last few years is ETL delivered in the cloud. The question is: how does ETL work on a cloud-based architecture when the data is often on-premise? If the data is on-premise, then the data processing is on-premise; likewise, if the data is in an off-site data center, the processing should be in that off-site data center. Traditional ETL tools followed a three-tier architecture, meaning they are split into three parts: Design interface for the user Metadata repository Processing layer ETL Three-Tier Architecture All three layers were designed to work within the four walls of your organization. To cloud-enable these platforms in an on-premise scenario, the two functions of user interface and metadata repository are moved to the cloud, while the processing engine stays on-premise; when the processing engine is supposed to run, it receives the appropriate commands and information from the cloud metadata repository. The processing engine then runs the data-movement routine on-premise, which allows the data to live where it natively resides rather than requiring all of it to move to the cloud. When something needs to run in the cloud, another engine in the cloud processes that data. The storage and design of the ETL movement are hosted by the cloud ETL vendor, but the engine that processes the commands can sit in multiple locations. Talend Data Integration The process of merging data from various sources into a single view is known as data integration: from mapping, ingestion, cleansing, and transformation to a destination sink, making data valuable and actionable for the individuals who access it. Talend offers strong data integration tools for performing ETL processes. Because data integration is a complex and slow process, Talend solves the problem by completing integration jobs up to 10x faster than manual programming, at a very low cost. 
Talend data integration comes in two versions: Talend Data Management Platform Talend Open Source Data Integration Talend Open Studio (An ETL tool from Talend) One of the most powerful open-source data integration tools available in the market is Talend Open Studio. This ETL tool helps you effortlessly manage the various steps involved in an ETL process, from the basic design of the ETL job to the execution of the ETL data load. Talend Open Studio is based on a graphical user interface with which you can simply map data between the source and target areas: all you need to do is select the required components from the palette and place them in the workspace. It also offers a metadata repository from which you can reuse and repurpose work, helping you increase productivity and efficiency over time. Advantages of ETL tools Ease of Use An ETL tool is easy to use, as the tool itself identifies data sources and the rules for extracting and processing data, eliminating the need for manual programming, where you would have to write the code and procedures yourself. Visual Data Flow ETL tools are based on a graphical user interface that lets you specify instructions with a drag-and-drop method to represent the data flow in a process. Operational Resilience Many data warehouses are delicate, and operational problems arise. To reduce these problems, ETL tools possess built-in debugging functionality that enables data engineers to build on the features of an ETL tool to develop a well-structured ETL system. Simplify Complex Data Management Situations Moving large volumes of data and transferring them in batches becomes easier with ETL tools, which handle complex rules and transformations and assist with string manipulations, calculations, and data changes. 
Richer Data Cleansing ETL tools are equipped with more advanced cleansing functions than those available in plain SQL. These functions serve the requirements of the complex transformations that commonly occur in a complex data warehouse. Performance The overall structure of an ETL system minimizes the effort of building an advanced data warehousing system, and many ETL tools come with performance-improving technologies such as massively parallel processing, cluster awareness, and symmetric multiprocessing. Various Categories of ETL Tools ETL tools allow organizations to make their data meaningful, accessible, and usable across diverse data systems. Choosing the right ETL tool is crucial, and complex given how many tools are available, so we have divided them into four categories according to organizational needs: Open-Source ETL tools As with other aspects of software infrastructure, there is huge demand for open-source ETL tools and projects. These open-source tools are built for maintaining scheduled workflows and batch processes. Cloud-native ETL tools With most data moving to the cloud, many cloud-based ETL services have started to evolve; a few stick to the basic batch model, while others offer intelligent schema detection, real-time support, and more. Real-time ETL tools Performing ETL in batches makes sense only when you do not need real-time data; batch processing works well for tax calculations and salary reporting. Modern applications, however, need real-time access to data from various sources: for instance, when you upload an image to your Instagram account, you want your friends to see it immediately, not a day later. Batch ETL tools Most ETL tools in the world are based on on-premise batch processing. 
In the past, most organizations used their free database and compute resources to perform overnight batch processing of ETL jobs, consolidating data during off-hours. Future Scope of Talend ETL tool Every day, organizations receive huge volumes of data through enquiries, emails, and service requests, and handling that data efficiently becomes a priority task to ensure success; the future of an organization depends on how well it handles its data to maintain healthy customer relationships. Managing data becomes easier with ETL tools, which improve data processing and increase productivity. The most desired job profiles related to Talend are Talend ETL developer, Talend developer, and Talend admin. There are many job profiles available in the Talend domain, as it is a rewarding career path with some of the best opportunities in Big Data, and there is great demand for job aspirants with ETL skills because of the need to handle large volumes of data efficiently. According to the ZipRecruiter website, the average salary quoted for a Talend ETL developer in the USA is $126,544 per year as of Oct 7, 201
This Microsoft Azure tutorial provides you with in-depth knowledge of the set of Azure cloud services that help your organization meet its business challenges. Before jumping into the Azure tutorials directly, let's look at what cloud computing is all about. Introduction to Cloud Computing Kinds of Cloud Computing Services Types of Cloud Deployments Azure Tutorial Forefront Identity Manager (Microsoft Identity Manager) Azure Cloud Services Azure Management Portal Why do you need Azure Certification Azure Trends Azure Competition Azure Training Azure vs AWS Azure Security Introduction to Cloud Computing Today, cloud computing is a term used everywhere, because over the past 10 years the shift from traditional software models to the internet has gained a lot of momentum. The near future of cloud computing promises new and innovative ways to collaborate everywhere via mobile devices. So, what actually is cloud computing? Cloud computing is defined as the on-demand delivery of computing services, such as software, networking, databases, storage, servers, and analytics, over the internet. The organizations that offer these computing services are known as cloud providers. Scenario Before Cloud Computing Traditional business applications have always been quite expensive and complicated; the variety and amount of software and hardware needed to run them are daunting, and a whole team of experts is required to configure, install, update, secure, run, and test them. Multiply this effort across many applications, and you can see why even global corporations with the best IT departments are not acquiring the applications they need, and why medium and small businesses stand little chance. Scenario After Cloud Computing The headaches that come with managing software and hardware for various computing services can be eliminated by cloud computing, because responsibility for managing them rests with the cloud service provider. 
Also, you pay only for what you use, scaling up or down is simple, and upgrades are automatic. Working of Cloud Computing Cloud computing offers an easy way to access a broad set of application services, databases, storage, and servers over the internet. Microsoft Azure and other cloud computing platforms own and manage the hardware connected to the network that these application services need; you provision and use what you require through a web application. Benefits of Cloud Computing From global corporations to small startups, non-profits, and government agencies, a variety of organizations are embracing cloud computing for all kinds of reasons. The following are some of the things the cloud makes possible: Back up, store, and recover data. Deploy your application easily in multiple locations. Stop spending money on maintaining and running data centers. Create new services and apps. Host blogs and websites. Eliminate guessing about infrastructure capacity needs. Increase agility and speed. Kinds of Cloud Computing Services Cloud computing services fall into three major categories: Software as a Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a Service (IaaS). Infrastructure as a Service (IaaS) - IaaS providers supply infrastructure such as virtual machines (VMs), servers, operating systems, networks, and storage, and you pay for what you use. Platform as a Service (PaaS) - Cloud providers host development tools on their infrastructures, and users access those tools through gateway software, web portals, or APIs (Application Programming Interfaces). Software as a Service (SaaS) - SaaS is a distribution model that provides software applications (often called web services) over the internet. 
SaaS services and applications can be accessed by users from any location with the help of a mobile device or a computer that has internet access. Related Page: Introduction To Azure SaaS

Types of Cloud Deployments

Cloud computing resources can be deployed in 3 ways: public cloud, private cloud, and hybrid cloud.

Public Cloud - Public clouds are owned and maintained by third-party cloud service providers. They deliver their computing resources, such as storage and servers, over the internet. The most popular example of a public cloud is Microsoft Azure.

Private Cloud - In a private cloud, the infrastructure and services are operated on a private network, and the cloud computing resources are used exclusively by a single business or organization.

Hybrid Cloud - A hybrid cloud combines private and public clouds, bound together by technology that enables applications and data to be shared between them. A hybrid cloud provides businesses with more deployment options and greater flexibility by allowing applications and data to move between public and private clouds.

Azure Tutorial - What is Microsoft Azure?

By now, you must have understood what cloud computing is all about. Now, it's time to talk about one of the most prominent cloud computing platforms, Azure. Microsoft Azure is a public cloud computing platform developed by Microsoft. It offers a range of cloud services for networking, storage, analytics, computing, etc. For running existing applications or for developing and scaling new ones, users can pick and choose from these services in the public cloud. The platform is widely considered both an IaaS and a PaaS offering. Related Page: Azure Cloud Computing and Services

History of Azure

In 2008, Microsoft revealed its plan to introduce a cloud computing service known as Windows Azure. Azure's preview versions became available and matured, which led to its commercial launch in the year 2010.
Though early iterations of the Azure cloud services fell behind more established cloud offerings such as AWS (Amazon Web Services), the portfolio continued to grow and to support a larger base of operating systems, frameworks, and programming languages. Windows Azure was rebranded as Microsoft Azure in 2014, as Microsoft recognized that the implications of cloud computing stretched far beyond Windows.

Azure Costs and Pricing

Azure primarily uses a pay-as-you-go pricing model: you pay only for the services you use. Nevertheless, if a single application uses multiple Azure services, each service may involve multiple pricing tiers. Additionally, if a user makes a long-term commitment to certain services, such as compute instances, Microsoft provides a discounted rate. Because many factors are involved in cloud service pricing, a company should review and manage its cloud utilization to minimize costs. Azure-native tools like Azure Cost Management can help to monitor, visualize, and optimize cloud spend. It is also possible to use third-party tools like RightScale or Cloudability to manage Azure resource usage and the costs associated with it.

Forefront Identity Manager (Microsoft Identity Manager)

Forefront Identity Manager (FIM), also known as Microsoft Identity Manager, is a self-service identity management software suite for handling role-based access control policies, credentials, and identities across heterogeneous computing environments. Forefront Identity Manager incorporates self-help tools in Microsoft Outlook so that end users can handle routine aspects of identity and access, such as resetting their own passwords, without the need for help desk assistance. It also enables end users to create their own email distribution lists and security groups. IT administrators can use this software to manage smart cards and digital certificates. It also offers automation and administrative tools.
FIM can be linked to Azure Active Directory with the help of the FIM Connector for Windows Azure Active Directory tool. This tool is used to synchronize on-premises data in FIM to Azure Active Directory. Once you download and install the tool, you can simply follow the wizard to connect your on-cloud Azure Active Directory with your FIM information.

Azure Load Balancer

Azure Load Balancer is a cloud-based system that enables a set of machines to function as one single machine for serving user requests. The primary job of a load balancer is to take client requests, determine which machines in the set are able to handle such requests, and forward the requests to those machines. For in-depth information on Azure Load Balancer, including creating a public load balancer using the Azure portal, you can view here.

Azure Data Factory

Azure Data Factory is a fully managed service for composing data storage, processing, and data movement services into reliable, scalable, and streamlined production pipelines. Azure Data Factory offers access to cloud data in Azure Storage and Azure SQL Database, and to on-premises data in SQL Server. For on-premises data, access is provided via a data management gateway that connects to on-premises SQL Server databases. For in-depth information on Azure Data Factory, you can see here.

Azure Data Lake

Azure Data Lake is a highly scalable public cloud service that enables developers, business professionals, scientists, and other Microsoft customers to gain insight from large, complex datasets. Customers can provision Azure Data Lakes to store an unlimited amount of structured, semi-structured, or unstructured data from a variety of sources. For more information on Azure Data Lake, you can view here.

Azure Cloud Services

Azure cloud services are categorized into 18 major product types. They are as follows.

Web - Web services support web application development and deployment.
They also provide features for API management, content delivery, search, reporting, and notification.

Data storage - Data storage services offer scalable cloud storage for structured as well as unstructured data, and also offer support for archival storage, persistent storage for containers, and big data projects.

Analytics - Analytics services offer distributed analytics and storage, and features for big data analytics, real-time analytics, data lakes, machine learning, etc.

Management - Management services offer a range of compliance, recovery, backup, monitoring, scheduling, and automation tools that enable a cloud administrator to manage an Azure deployment.

Mobile - Mobile products help a developer build cloud applications for mobile devices, provide notification services, offer tools for building APIs, support back-end tasks, and so on.

Migration - Migration tools help a company forecast the costs of workload migration and perform the actual migration of workloads from local data centers to the Azure cloud.

DevOps - The DevOps group offers collaboration and project tools, like Visual Studio Team Services, that make DevOps software development processes much easier to accomplish. Related Page: Introduction To Azure DevOps

Databases - The databases category incorporates DBaaS (Database as a Service) offerings for SQL and NoSQL, and other database instances such as Azure Database for PostgreSQL and Azure Cosmos DB.

Azure Compute - Azure Compute services allow a user to deploy and manage virtual machines (VMs), containers, and batch processing. They also support remote application access.

Containers - Container services help a company create, register, orchestrate, and manage high volumes of containers in the Azure cloud with the help of common platforms like Kubernetes and Docker.
Machine Learning and Artificial Intelligence - This is a broad range of services that a developer can use to infuse machine learning, AI, and cognitive computing capabilities into applications and data sets. Related Page: Why Azure Machine Learning?

Security - These Azure products offer capabilities to detect and respond to cloud security threats, and to manage encryption keys and various other sensitive assets.

Development - Development services help application developers share code, test applications, and track potential issues.

Internet of Things (IoT) - These services enable users to capture, monitor, and analyze IoT data from sensors and various other devices. Related Page: Azure IoT Edge Overview

IAM (Identity and Access Management) - IAM offerings make sure that only authorized users can access Azure services, and help protect encryption keys and other sensitive information in the cloud.

Hybrid Integration - These services enable server backup, site recovery, and connecting private and public clouds.

CDN (Content Delivery Network) and Media - These services include on-demand streaming, digital rights protection, encoding, indexing, and media playback.

Networking - The networking group includes virtual networks, dedicated connections, gateways, and services for traffic management, load balancing, DNS (Domain Name System) hosting, network protection against DDoS (Distributed Denial of Service) attacks, and diagnostics.

Azure Management Portal

The Microsoft Azure Management Portal is a simple way to observe and track all Azure subscriptions, spending, and usage. Its dashboard and reporting features will provide you with an in-depth understanding of Azure expenditure and consumption.
The features of the Azure Management Portal include:

Strategize future usage and capacity
Exploit your cloud data
Optimize virtual machine size and scale
Control billing and spend
View all subscriptions in one place

Azure Resource Manager

Basically, the infrastructure for your application is made up of various components, such as a database server, a database, a virtual network, a storage account, and a virtual machine. You don't view these elements as separate entities; instead, you view them as interdependent and related components of a single entity, and you want to monitor, manage, and deploy them as a group. Related Page: Azure Monitor

Azure Resource Manager allows you to work with the resources in your solution as a group. You can deploy, update, or delete all the resources of your solution in a single, coordinated operation. A template can be used for deployment, and it can work for different environments such as testing, staging, and production. Azure Resource Manager offers security, auditing, and tagging features to help you manage your resources after deployment.

Consistent Management Layer - Resource Manager provides a consistent management layer for executing tasks via Azure PowerShell, client SDKs, the REST API, and the Azure portal. All the Azure portal's capabilities are also available through Azure PowerShell, the Azure CLI, the Azure REST APIs, and client SDKs. These tools all communicate with the Azure Resource Manager API: once the API passes requests to the Resource Manager service, it authenticates and authorizes them and then routes the requests to the appropriate resource providers.

Azure Storage

Azure Storage is Microsoft's cloud storage solution for modern data storage scenarios.
Azure Storage provides a highly scalable object store for data objects, a NoSQL store, a messaging store for reliable messaging, and a file system service for the cloud. The benefits of Azure Storage are as follows.

Highly available and durable - In the event of transient hardware failures, redundancy makes sure your data is safe. Data can also be replicated across data centers or geographical regions for additional protection from a local catastrophe or natural disaster. In this way, data remains highly available even during an unexpected outage.

Secure - The service encrypts all data written to Azure Storage, and Azure Storage gives you fine-grained control over who can access your data.

Scalable - Azure Storage is designed to be highly scalable to meet the data storage and performance needs of today's applications.

Managed - Microsoft Azure handles maintenance and any critical problems for you.

Accessible - You can access Azure Storage from anywhere in the world over HTTP or HTTPS.

Why Do You Need Azure Certification?

Today, many organizations around the world are using the Azure cloud platform to drive their businesses, which explains the importance of gaining expertise in Azure. You can showcase this expertise through an Azure certification: it is proof of your knowledge of Azure features, deployment, working, and management.

Azure Trends

With the demand for cloud computing increasing day by day and Microsoft established as a leader in the cloud computing space, it is important to analyze what Microsoft Azure is up to and its impact in the near future. As explained previously in this tutorial, Azure offers services in the form of IaaS. For most customers, the advantage of IaaS is reduced expense and complexity in managing physical servers and data center infrastructure. Still, there are many companies that aren't willing to keep all their eggs in the public cloud basket.
Microsoft has identified this and created a low-level infrastructure private cloud service known as Azure Stack, which provides companies with a hybrid cloud solution. The main idea behind Azure Stack is to give companies the power of cloud services while allowing them to retain control of their data center for true hybrid cloud agility. Azure also continues to evolve by enhancing accessibility to migration and various other services, such as Linux virtual machines. In the near future, we will see many cost-friendly services become available to an ever-growing customer base looking for a simple but trustworthy and powerful Azure cloud.

Azure Competition

Microsoft Azure is one of the major global public cloud service providers. Other major providers include Amazon Web Services (AWS), IBM, and Google Cloud Platform (GCP). At present, there is no standardization among cloud services or capabilities. This implies that no two cloud service providers deliver the same service in the same manner using the same integrations or APIs, which makes it difficult for a business to use more than one cloud service provider when pursuing a multi-cloud strategy, though third-party cloud management tools can reduce some of these challenges.

Azure Training

The IT industry is going through a wave of innovation powered by the cloud phenomenon. Azure training equips learners with in-depth knowledge of Azure concepts so they can effectively undertake various tasks as a developer, administrator, or database administrator. In Azure training, you will learn the main cloud computing principles and how these principles are implemented in Microsoft Azure.

Azure vs AWS

Two of the cloud platforms most trusted by businesses (old and new, big and small) all over the world are Microsoft Azure and AWS.
There is heavy discussion among businesses about which cloud platform to choose between Azure and AWS, and "Azure vs AWS" is a commonly searched comparison today. You can compare the feature sets of both cloud platforms and decide which one is suitable for your business. More explanation on this is given here.

Azure Security

Privacy and security are built into the Azure platform. You get a unified view of security across all your on-premises and cloud workloads. You can automatically discover and onboard new Azure resources and apply security policies across your hybrid cloud workloads to ensure compliance with security standards. You can also search, analyze, and connect security data from different sources, including firewalls and various other partner solutions.

Thus, this Azure tutorial has discussed the components and features of Azure and what it can do for your business as a powerful cloud computing platform.
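Before moving on, the load-balancing idea described earlier in this tutorial (a set of machines serving user requests as if they were one machine) can be sketched in a few lines. This is a minimal round-robin sketch of the general concept, not Azure Load Balancer itself; the class and backend names are made up for illustration.

```python
import itertools

class RoundRobinBalancer:
    """Hands each incoming request to the next backend in rotation."""

    def __init__(self, backends):
        # cycle() endlessly repeats the backend list in order.
        self._cycle = itertools.cycle(backends)

    def route(self, request):
        """Pick the next backend and pair it with the request."""
        backend = next(self._cycle)
        return backend, request

# Three hypothetical virtual machines behind one entry point.
lb = RoundRobinBalancer(["vm-1", "vm-2", "vm-3"])
for req in ["GET /", "GET /a", "GET /b", "GET /c"]:
    print(lb.route(req))  # the 4th request wraps back to vm-1
```

Real load balancers add health probes (skip machines that stop responding) and other distribution modes, but the rotation above is the core of "many machines acting as one".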
Introduction

Are you excited to learn more about the Splunk REST API? Then you are at the right place to get full-fledged information about it. The Splunk REST API gives users access to the same information and functionality that are available to the core system of the software and to Splunk Web. API functions can be classified into types based on interface behavior, such as running searches and managing objects and configurations.

Basic Introduction to the Splunk REST API

The Splunk REST API is organized around object and configuration resources, where a resource is a single named object stored by splunkd, such as a search job or a raw TCP input. Resources are grouped into collections of a similar type, and a collection can contain a combination of resources and other collections. The API follows the principles of REST, an architectural style with the following properties:

Separation of concerns, such as data storage and access mechanism, between the client and the server.
Stateless client-server interaction, excluding the concept of a session.
Caching of operation data to enhance request-response performance.
A generalized, uniform interface for simplicity.
A layered arrangement of architectural components, arranged hierarchically so that each layer sees only the information within its own scope.
Most of these architectural properties align with the Splunk REST API implementation, which uses the HTTP protocol to access domain resources through corresponding endpoints. Users send API requests to the server using the same protocol a browser uses. URL addressing is defined as part of how HTTP maps to Splunk platform resources, with each resource identified by a URI (Uniform Resource Identifier).

What is the Splunk REST API?

An API (Application Programming Interface) defines interfaces to a programming library or framework for accessing the functionality that the library or framework provides. The Splunk Enterprise REST API provides methods for accessing every product feature. Here are the access methods provided by the Splunk REST API:

1. DELETE: Delete a resource.
2. GET: List the present state data associated with a resource, or list its child resources.
3. POST: Create or update resource data, and enable or disable resource functionality.

Related Article: Splunk Enterprise

How to connect to Splunk?

To use the Splunk REST API, one must use the splunkd management port, 8089, and the secure HTTPS protocol. One can set the enableSplunkdSSL property in the server.conf file to false to use the unsecure HTTP protocol instead.

How to get data from REST APIs into Splunk?

Use the REST modular input. First, go to Splunkbase and download the latest release. Then place it under SPLUNK_HOME/etc/apps and restart Splunk. Then perform the configuration: navigate to Manager, then Data Inputs, and then REST. After that, click on the 'New' button to create a new REST input and fill in the fields shown.
After performing the entire process, search for the data in the RESTful responses, which are in JSON format; this is very convenient for automatic field extraction.

What is the use of the Splunk MINT REST API?

The Splunk MINT REST API is used to retrieve insights, upload dSYMs, and manage projects and teams. This interface also incorporates various elements of the REST tradition and makes access to these actions consistent.

How to access endpoints and REST operations?

To access endpoints and REST operations, a username and password are required, and Splunk users must have a role with capability-based authorization to use REST endpoints. A user with an administrator role, such as admin, can easily access this information in Splunk Web. To see the roles assigned to a user, go to Settings, then Access Controls, and click on Users. To determine the capabilities of a role, go to Settings, then Access Controls, and click on Roles. The authentication session timeout is one hour by default; it is configurable using the sessionTimeout setting in the general stanza of the server.conf file.

What are HTTP status codes in the Splunk REST API?

In addition to content data, the responses you receive carry HTTP status codes. These are generally not included in the endpoint descriptions, as the implementation follows the HTTP standard for reporting status. Status codes are documented only where they are of particular importance for an endpoint, or where the Splunk software's behavior differs from the HTTP standard. Related Article: Splunk Alert And Report

What do you mean by an Atom Feed response?

Splunk API responses sometimes use the Atom Syndication Format, commonly called an Atom feed.
Here are some of the additions to the standard Atom feed XML: an OpenSearch namespace declaration, and totalResults, startIndex, and itemsPerPage nodes. Related Article: Accessing and Updating Splunk API

What are the main response elements?

The important response message elements are: metadata encapsulating the content element, and the key/value pair data payload. Endpoints return a list of entry elements, sorted by entry name by default.

What is an encoding scheme? Explain it with examples.

The Splunk REST API supports multiple encoding schemes, but not all schemes are supported by every endpoint. The REST API Reference Manual lists the valid encoding schemes for each endpoint. XML is the default encoding scheme for most REST API endpoints, and it is the one used by the documentation examples. Users can append the output_mode parameter as a query string to specify a supported encoding scheme other than XML. Responses can be returned using encoding schemes such as csv, json, json_cols, json_rows, raw, and xml. An endpoint returns an error response if the specified encoding scheme is not supported.
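As a small illustration of the connection details and the output_mode parameter described above, the following Python sketch builds request URLs for the documented search-jobs endpoint on the splunkd management port. The host name is a placeholder, and the scheme list is the general one above; check the REST API Reference Manual for what each endpoint actually supports.

```python
import urllib.parse

SPLUNK_HOST = "localhost"  # placeholder host for illustration
MGMT_PORT = 8089           # default splunkd management port

def endpoint_url(path, output_mode=None):
    """Build a management-port URL, optionally selecting an
    encoding scheme via the output_mode query parameter."""
    valid = {"xml", "json", "json_cols", "json_rows", "csv", "raw"}
    url = f"https://{SPLUNK_HOST}:{MGMT_PORT}{path}"
    if output_mode is not None:
        if output_mode not in valid:
            # Mirrors the API's behavior of rejecting unsupported schemes.
            raise ValueError(f"unsupported encoding scheme: {output_mode}")
        url += "?" + urllib.parse.urlencode({"output_mode": output_mode})
    return url

print(endpoint_url("/services/search/jobs"))
# https://localhost:8089/services/search/jobs
print(endpoint_url("/services/search/jobs", "json"))
# https://localhost:8089/services/search/jobs?output_mode=json
```

An actual request would then POST a search string to that URL over HTTPS with your Splunk credentials.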
If you're looking for SQL Server interview questions for experienced professionals or freshers, you are at the right place. There are a lot of opportunities from many reputed companies in the world. According to research, the average salary for SQL Server professionals is approximately $69,682 per annum. So, you still have the opportunity to move ahead in your career in SQL Server. Mindmajix offers Advanced SQL Server Interview Questions 2018 that help you crack your interview and acquire your dream career as a SQL Server developer.

SQL Server Interview Questions

In what sequence are SQL statements processed?

The clauses of a SELECT are processed in the following sequence: FROM clause, WHERE clause, GROUP BY clause, HAVING clause, SELECT clause, ORDER BY clause, TOP clause.

Can we write a distributed query and get data that is located on another server and in an Oracle database?

SQL Server can be linked to any server, provided it has an OLE-DB provider from Microsoft to allow the link. For example, Microsoft provides an OLE-DB provider for Oracle that can be used to add it as a linked server to the SQL Server group.

If we drop a table, does it also drop related objects like constraints, indexes, columns, defaults, views, and stored procedures?

Yes, SQL Server drops all related objects that exist inside a table, such as constraints, indexes, columns, and defaults. But dropping a table will not drop views and stored procedures, as they exist outside the table.

How would you determine the time zone under which a database was operating?

Can we add an identity column to a decimal datatype?

Yes, SQL Server supports this.

What is the difference between a LEFT JOIN with a WHERE clause and a LEFT JOIN with no WHERE clause?

An OUTER LEFT/RIGHT JOIN with a WHERE clause can act like an INNER JOIN if not used wisely or logically.

What are the multiple ways to execute a dynamic query?
EXEC sp_executesql, EXECUTE().

What is the difference between COALESCE() and ISNULL()?

ISNULL accepts only 2 parameters. The first parameter is checked for a NULL value; if it is NULL, the second parameter is returned, otherwise the first parameter is returned. COALESCE accepts two or more parameters; it returns the first non-NULL parameter.

How do you generate file output from SQL?

While using SQL Server Management Studio or Query Analyzer, we have an option in the menu bar: QUERY >> Results To >> Results to File.

How do you prevent SQL Server from giving you informational messages during and after a SQL statement execution?

SET NOCOUNT ON

By mistake, duplicate records exist in a table. How can we delete the copies of a record?

;with T as
(
  select *, row_number() over (partition by Emp_ID order by Emp_ID) as rank
  from employee
)
delete from T where rank > 1

What operator performs pattern matching?

The pattern matching operator is LIKE, and it is used with two wildcard characters: 1. % matches zero or more characters, and 2. _ (underscore) matches exactly one character.

What's the logical difference, if any, between the following SQL expressions?

-- Statement 1
SELECT COUNT ( * ) FROM Employees
-- Statement 2
SELECT SUM ( 1 ) FROM Employees

They're the same unless the Employees table is empty, in which case the first yields a one-column, one-row table containing a zero and the second yields a one-column, one-row table containing a NULL.

SQL Server Interview Questions And Answers

Is it possible to update views? If yes, how? If not, why?

Yes, we can modify views, but a DML statement on a join view can modify only one base table of the view (so even if the view is created upon a join of many tables, only one table, the key-preserved table, can be modified through the view).

Could you please name the different kinds of joins available in SQL Server?
INNER JOIN; OUTER JOIN (LEFT, RIGHT, and FULL); CROSS JOIN.

How important do you consider cursors or while loops for a transactional database?

We would like to avoid cursors in an OLTP database as much as possible; cursors are mainly used for maintenance or warehouse operations.

What is a correlated sub query?

A correlated sub query is a sub query that is tied to the outer query. It is mostly used in self joins.

What is faster, a correlated sub query or an inner join?

An inner join, generally, because a correlated sub query has to be evaluated for every row of the outer query.

You are supposed to work on SQL optimization; given a choice, which one runs faster, a correlated sub query or an EXISTS?

EXISTS.

Can we call a .DLL from SQL Server?

Yes, we can call a .DLL from SQL Server.

What are the pros and cons of putting a scalar function in a query's select list or in the where clause?

It should be avoided if possible, as scalar functions in these places can slow the query down dramatically.

What is the difference between a TRUNCATE and a DROP statement?

What is the difference between a TRUNCATE and a DELETE statement?

What are user-defined data types, and when should you go for them?

User-defined data types let you extend the base SQL Server data types by providing a descriptive name and format to the database. Take, for example, a column called Flight_Num which appears in many tables in your database. In all these tables it should be varchar(8). In this case you could create a user-defined data type called Flight_num_type of varchar(8) and use it across all your tables. See sp_addtype and sp_droptype in Books Online.

Can you explain the integration between SQL Server 2005 and Visual Studio 2005?

This integration provides a wider range of development with the help of the CLR for the database server, because the CLR gives developers flexibility for developing database applications and also provides language interoperability across Visual C++, Visual Basic .NET, and Visual C# .NET.
The CLR makes arrays, classes, and exception handling available through programming languages such as Visual C++ or Visual C#, for use in stored procedures, functions, and triggers to build database applications dynamically, and it also provides more efficient reuse of code and faster execution of complex tasks. We particularly like the error-checking powers of the CLR environment, which reduce run-time errors.

SQL Server Interview Questions And Answers For Experienced

You have been assigned to create a complex view, you have completed that task, and the view is now ready to be pushed to the production server. You are supposed to fill out a deployment form before any change is pushed to production, and one of the fields in that form asks for the "expected storage requirement". What factors will you consider to calculate the storage requirement for that view?

This is tricky: a view doesn't take space in the database, as views are virtual tables. Storage is required only to store the index, in case you are developing an indexed view.

What is an index, a clustered index, and a non-clustered index?

Clustered index: A clustered index is a special type of index that reorders the way records in the table are physically stored. Therefore a table may have only one clustered index.

Non-clustered index: A non-clustered index is a special type of index in which the logical order of the index does not match the physical stored order of the rows on disk. The leaf nodes of a non-clustered index do not consist of the data pages; instead, the leaf nodes contain index rows.

Write down the general syntax for a SELECT statement covering all the options.

Here's the basic syntax (also check out SELECT in Books Online for advanced syntax):

SELECT select_list
[INTO new_table]
FROM table_source
[WHERE search_condition]
[GROUP BY group_by_expression]
[HAVING search_condition]
[ORDER BY order_expression [ASC | DESC]]

What is a join? Explain the different types of joins.
Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table.

Types of joins: INNER JOINs, OUTER JOINs, and CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINs, RIGHT OUTER JOINs, and FULL OUTER JOINs. For more information, see the Books Online pages titled "Join Fundamentals" and "Using Joins".

What is the OSQL utility?

OSQL is a command line tool that executes queries and displays the results like Query Analyzer does, but everything is done at the command prompt.

What is the difference between OSQL and Query Analyzer?

Both execute queries and display the results, but Query Analyzer is graphical while OSQL is a command line tool. OSQL is quite useful for batch processing or executing remote queries.

What is cascade delete / update?

CASCADE allows deletions or updates of key values to cascade through the tables defined to have foreign key relationships that can be traced back to the table on which the modification is performed.

SQL Server Interview Questions For 2-5 Years Experienced

What are some of the join algorithms used when SQL Server joins tables?

Loop join (indexed keys, unordered), merge join (indexed keys, ordered), and hash join (non-indexed keys).

What is the maximum number of tables that can be joined in a single query?

256; check SQL Server limits.

What are magic tables in SQL Server?

The magic tables are automatically created and dropped when you use TRIGGERS. SQL Server has two magic tables, named INSERTED and DELETED, which are maintained by SQL Server for its internal processing.
These magic tables are used when we run an INSERT, UPDATE, or DELETE on a table; they are not physical tables but internal ones. Whenever an INSERT statement is fired, the INSERTED table is populated with the newly inserted row, and whenever a DELETE statement is fired, the DELETED table is populated with the deleted row. When an UPDATE statement is fired, both tables are used: the original row before the update is stored in the DELETED table, and the new updated row is stored in the INSERTED table.

Can we disable a trigger? If yes, how?
Yes. We can disable a single trigger on the database by using "DISABLE TRIGGER triggerName ON <>". We also have the option to disable all triggers by using "DISABLE TRIGGER ALL ON ALL SERVER".

Why do you need indexing? Where is it stored, and what do you mean by a schema object? For what purpose do we use a view?
We can't create an index on an index. An index is stored in the user_index table. Every object that has been created in a schema is a schema object, such as a table or a view. If we want to share particular data with various users, we use a virtual table over the base table: that is a view.
Indexing is used for faster searching, i.e. to retrieve data faster from various tables. A schema contains a set of tables; basically, a schema is a logical separation of the database. A view is created for faster retrieval of data. It is a customized virtual table, and we can create a single view over multiple tables. The only drawback is that a view needs to be refreshed to retrieve updated data.

What is the difference between UNION and UNION ALL?
UNION removes the duplicate rows from the result set, while UNION ALL does not.

Which system table contains information on the constraints of all the tables created?
USER_CONSTRAINTS is the system table that contains information on the constraints of all the tables created.

SQL Server Joins Interview Questions

What are the different types of join?
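One variety covered below, the self join, is often the hardest to picture, so here is a runnable example first (Python's sqlite3; the staff table and its reporting hierarchy are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE staff (name TEXT, manager TEXT)")
cur.executemany("INSERT INTO staff VALUES (?, ?)",
                [("Alice", None), ("Bob", "Alice"), ("Carol", "Alice"), ("Dave", "Bob")])

# Self join: the table is joined to itself under two aliases,
# pairing each employee (e) with the row of their manager (m).
rows = cur.execute("""
    SELECT e.name, m.name AS manager_name
    FROM staff e
    INNER JOIN staff m ON e.manager = m.name
    ORDER BY e.name
""").fetchall()
print(rows)  # [('Bob', 'Alice'), ('Carol', 'Alice'), ('Dave', 'Bob')]
```

Because this is written as an inner self join, Alice (who has no manager) drops out; a LEFT JOIN would keep her with a NULL manager_name.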
Cross Join
A cross join that does not have a WHERE clause produces the Cartesian product of the tables involved in the join. The size of a Cartesian product result set is the number of rows in the first table multiplied by the number of rows in the second table. A common example is when a company wants to combine each product with a pricing table to analyze each product at each price.

Inner Join
A join that displays only the rows that have a match in both joined tables is known as an inner join. This is the default type of join in the Query and View Designer.

Outer Join
A join that includes rows even if they do not have related rows in the joined table is an outer join. You can create three different outer joins to specify the unmatched rows to be included:
Left Outer Join: All rows in the first-named table (the "left" table, which appears leftmost in the JOIN clause) are included; unmatched rows in the right table do not appear.
Right Outer Join: All rows in the second-named table (the "right" table, which appears rightmost in the JOIN clause) are included; unmatched rows in the left table are not included.
Full Outer Join: All rows in all joined tables are included, whether they are matched or not.

Self Join
This is a particular case in which a table joins to itself, with one or two aliases to avoid confusion. A self join can be of any type, as long as the joined tables are the same. It is rather unique in that it involves a relationship with only one table. The common example is when a company has a hierarchical reporting structure whereby one member of staff reports to another. A self join can be an outer join or an inner join.

What is data warehousing?
A data warehouse is a database with the following characteristics:
Subject-oriented, meaning that the data in the database is organized so that all the data elements relating to the same real-world event or object are linked together.
Time-variant, meaning that the changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time.
Non-volatile, meaning that data in the database is never over-written or deleted; once committed, the data is static, read-only, and retained for future reporting.
Integrated, meaning that the database contains data from most or all of an organization's operational applications, and that this data is made consistent.

What is a live lock?
A live lock is one where a request for an exclusive lock is repeatedly denied because a series of overlapping shared locks keeps interfering. SQL Server detects the situation after four denials and refuses further shared locks. A live lock also occurs when read transactions monopolize a table or page, forcing a write transaction to wait indefinitely.

How does SQL Server execute a statement with nested subqueries?
When SQL Server executes a statement with nested subqueries, it always executes the innermost query first. That query passes its results to the next query, and so on, until it reaches the outermost query. It is the outermost query that returns the result set.

How do you add a column to an existing table?
ALTER TABLE Department ADD Age INT;

Can one drop a column from a table?
Yes. To delete a column from a table, use:
ALTER TABLE table_name DROP COLUMN column_name

Which statement do you use to eliminate padded spaces between the month and day values in the function TO_CHAR(SYSDATE, 'Month, DD, YYYY')?
To remove padded spaces, you use the "fm" prefix before the date element that contains the spaces:
TO_CHAR(SYSDATE, 'fmMonth DD, YYYY')

Which operator do you use to return all of the rows from one query except the rows that are returned by a second query?
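The set operators in question can be compared side by side with a runnable sketch (Python's built-in sqlite3; tables a and b and their values are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2), (2), (3);
    INSERT INTO b VALUES (2), (3), (4);
""")

def setop(op):
    """Run 'SELECT x FROM a <op> SELECT x FROM b' and return the values."""
    return [r[0] for r in cur.execute(f"SELECT x FROM a {op} SELECT x FROM b ORDER BY x")]

print(setop("EXCEPT"))     # [1]                    rows in a that are not in b
print(setop("INTERSECT"))  # [2, 3]                 distinct rows present in both
print(setop("UNION"))      # [1, 2, 3, 4]           duplicates removed
print(setop("UNION ALL"))  # [1, 2, 2, 2, 3, 3, 4]  every row from both, duplicates kept
```

Note that EXCEPT, INTERSECT, and UNION all deduplicate; only UNION ALL preserves the duplicate 2 from table a alongside the 2 from table b.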
You use the EXCEPT operator to return all rows from the first query except the rows that also appear in the second query. The UNION operator returns all rows from both queries minus duplicates. The UNION ALL operator returns all rows from both queries including duplicates. The INTERSECT operator returns only those rows that exist in both queries.

How will you create a column alias?
Place the alias after the expression in the select list, e.g. SELECT salary * 12 AS annual_salary; the AS keyword is optional when specifying a column alias.

In what sequence are SQL statements processed?
The clauses of the subselect are processed in the following sequence (DB2):
1. FROM clause
2. WHERE clause
3. GROUP BY clause
4. HAVING clause
5. SELECT clause
6. ORDER BY clause
7. FETCH FIRST clause

How can we determine what objects a user-defined function depends upon?
Use the sp_depends system stored procedure, or query the sysdepends system table to return a list of objects that a user-defined function depends upon:
SELECT DISTINCT so1.name, so2.name
FROM sysobjects so1
INNER JOIN sysdepends sd ON so1.id = sd.id
INNER JOIN sysobjects so2 ON so2.id = sd.depid
WHERE so1.name = '<>'

What is lock escalation?
A query first takes the lowest-level lock possible, with the smallest footprint (row-level). When too many rows are locked (requiring too much RAM), the lock is escalated to a range or page lock. If too many pages are locked, it may escalate to a table lock.

What are the main differences between #temp tables and @table variables, and which one is preferred?
SQL Server can create column statistics on #temp tables.
Indexes can be created on #temp tables.
@table variables are stored in memory up to a certain threshold.

What are checkpoints in SQL Server?
Operations performed in SQL Server are not committed directly to the database: every operation must first be logged into the transaction log files and only then applied to the main database. A checkpoint is the point at which SQL Server is alerted to write all of that data through to the main database; if there were no checkpoints, the log files would fill up. We can use the CHECKPOINT command to force all data in SQL Server to be committed. This is also why stopping SQL Server can take a long time: a checkpoint is fired as part of shutdown.

Why do we use the OPENXML clause?
OPENXML parses XML data in SQL Server in an efficient manner. Its primary use is to insert XML data into the database.

Can we store PDF files inside a SQL Server table?
Yes, we can store this sort of data using a blob datatype.

Can we store videos inside a SQL Server table?
Yes, we can store videos inside SQL Server by using the FILESTREAM datatype, which was introduced in SQL Server 2008.

Can we hide the definition of a stored procedure from a user?
Yes. While creating a stored procedure we can use WITH ENCRYPTION, which converts the original text of the CREATE PROCEDURE statement to an encrypted format.

What are included columns when we talk about SQL Server indexing?
Indexes with included columns were introduced in SQL Server 2005 to assist in covering queries. Indexes with included columns are non-clustered indexes that have the following benefits:
Columns defined in the INCLUDE clause, called non-key columns, are not counted in the number of index key columns by the Database Engine.
Columns that previously could not be used in the index key, like nvarchar(max), can be included as non-key columns.
A maximum of 1023 additional columns can be used as non-key columns.

What is an execution plan? How would you view the execution plan?
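SQL Server exposes its plans through the "Show Execution Plan" toggle described below; as a runnable stand-in for the same idea, SQLite exposes its optimizer's choices through EXPLAIN QUERY PLAN (shown here from Python, with a hypothetical table and index):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (emp_no INTEGER, dept_name TEXT)")
cur.execute("CREATE INDEX ix_dept ON employee (dept_name)")

# EXPLAIN QUERY PLAN returns the access methods the optimizer chose,
# analogous in spirit to reading a SQL Server execution plan.
plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM employee WHERE dept_name = 'ECE'"
).fetchall()
for row in plan:
    print(row)  # the detail column shows whether ix_dept is used
```

Reading plans this way (does the engine scan the whole table, or seek through an index?) is exactly the skill the question above is probing, whatever the database product.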
An execution plan is basically a road map that graphically or textually shows the data-retrieval methods chosen by the SQL Server query optimizer for a stored procedure or ad hoc query. It is a very useful tool for a developer to understand the performance characteristics of a query or stored procedure, since the plan is what SQL Server places in its cache and uses to execute it. From within Query Analyzer there is an option called "Show Execution Plan" (located on the Query drop-down menu). If this option is turned on, the query execution plan is displayed in a separate window when the query is run.

Explain UNION, MINUS, UNION ALL, and INTERSECT.
INTERSECT returns all distinct rows selected by both queries.
MINUS returns all distinct rows selected by the first query but not by the second.
UNION returns all distinct rows selected by either query.
UNION ALL returns all rows selected by either query, including all duplicates.

SQL SERVER Query Interview Questions with Answers

SQL Server DATEADD() Function
Q) Write a query to display the date after 15 days.
SELECT DATEADD(dd, 15, GETDATE())
Q) Write a query to display the date after 12 months.
SELECT DATEADD(mm, 12, GETDATE())
Q) Write a query to display the date before 15 days.
SELECT DATEADD(dd, -15, GETDATE())

SQL Server DATEDIFF() Function
Q) Write a query to display employee details along with experience.
SELECT *, DATEDIFF(yy, doj, GETDATE()) AS 'Exp' FROM employee
Q) Write a query to display the details of employees who work in the ECE department and have more than 3 years of experience.
SELECT *, DATEDIFF(yy, doj, GETDATE()) AS 'Exp' FROM employee WHERE DATEDIFF(yy, doj, GETDATE()) > 3 AND dept_name = 'ECE'
Q) Write a query to display employee details along with age.
SELECT *, DATEDIFF(yy, dob, GETDATE()) AS 'Age' FROM employee
Q) Write a query to display the details of employees whose age is greater than 18.
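All of these DATEDIFF(yy, ...) queries rely on the fact that T-SQL's DATEDIFF counts datepart boundary crossings, not elapsed time; here is a plain-Python model of that semantics (an illustrative sketch, not a SQL Server API):

```python
from datetime import date

def datediff_yy(start: date, end: date) -> int:
    """Model of T-SQL DATEDIFF(yy, start, end): counts year
    boundaries crossed between the two dates, not full elapsed years."""
    return end.year - start.year

# Only one day apart, yet one year boundary is crossed:
print(datediff_yy(date(2023, 12, 31), date(2024, 1, 1)))  # 1

# Almost a full year elapsed, but no boundary crossed:
print(datediff_yy(date(2024, 1, 1), date(2024, 12, 31)))  # 0
```

Because of this, DATEDIFF(yy, dob, GETDATE()) can overstate a person's age by up to a year; the queries here follow the common interview convention anyway.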
SELECT *, DATEDIFF(yy, dob, GETDATE()) AS 'Age' FROM employee WHERE DATEDIFF(yy, dob, GETDATE()) > 18

SQL Server Multi-Row Functions
Q) Write a query to display the minimum salary of an employee.
SELECT MIN(salary) FROM employee
Q) Write a query to display the maximum salary of an employee.
SELECT MAX(salary) FROM employee
Q) Write a query to display the total salary of all employees.
SELECT SUM(salary) FROM employee
Q) Write a query to display the average salary of an employee.
SELECT AVG(salary) FROM employee
Q) Write a query to count the number of employees working in the company.
SELECT COUNT(*) FROM employee
Q) Write a query to display the minimum and maximum salary of employees.
SELECT MIN(salary) AS 'min sal', MAX(salary) AS 'max sal' FROM employee
Q) Write a query to count the number of employees working in the ECE department.
SELECT COUNT(*) FROM employee WHERE dept_name = 'ECE'
Q) Write a query to display the second-highest salary of an employee.
SELECT MAX(salary) FROM employee WHERE salary < (SELECT MAX(salary) FROM employee)
Q) Write a query to display the third-highest salary of an employee.
SELECT MAX(salary) FROM employee WHERE salary < (SELECT MAX(salary) FROM employee WHERE salary < (SELECT MAX(salary) FROM employee))

SQL SERVER: GROUP BY Clause
Q) Write a query to display the total salary of employees by city.
SELECT city, SUM(salary) FROM employee GROUP BY city;
Q) Write a query to display the number of employees by city.
SELECT city, COUNT(emp_no) AS 'no. of employees' FROM employee GROUP BY city;
Q) Write a query to display the total salary of employees by region.
SELECT region, SUM(salary) AS 'total_salary' FROM employee GROUP BY region;
Q) Write a query to display the number of employees working in each region.
SELECT region, COUNT(*) AS 'no. of employees' FROM employee GROUP BY region;
Q) Write a query to display the minimum and maximum salary by dept_name.
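Grouped aggregates like this min/max-per-department query can be checked quickly with a sketch (Python's sqlite3; the employee rows are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE employee (emp_name TEXT, dept_name TEXT, salary INTEGER)")
cur.executemany("INSERT INTO employee VALUES (?, ?, ?)", [
    ("Anil", "ECE", 9000), ("John", "ECE", 7000),
    ("Sunil", "CSE", 5000), ("Ram", "CSE", 4000), ("Ajay", "CSE", 6000),
])

# GROUP BY collapses the rows into one result row per department,
# and MIN/MAX are evaluated within each group.
rows = cur.execute("""
    SELECT dept_name, MIN(salary) AS min_sal, MAX(salary) AS max_sal
    FROM employee
    GROUP BY dept_name
    ORDER BY dept_name
""").fetchall()
print(rows)  # [('CSE', 4000, 6000), ('ECE', 7000, 9000)]
```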
SELECT dept_name, MIN(salary) AS 'min sal', MAX(salary) AS 'max sal' FROM employee GROUP BY dept_name
Q) Write a query to display the total salary of employees by dept_name.
SELECT dept_name, SUM(salary) AS 'total_sal' FROM employee GROUP BY dept_name
Q) Write a query to display the number of males in each department.
SELECT dept_name, COUNT(gender) AS 'no. of males' FROM employee WHERE gender = 'male' GROUP BY dept_name;
Note: We cannot put a WHERE condition after the GROUP BY clause; if we want to filter on an aggregate we must use the HAVING clause. The WHERE condition has to come before GROUP BY.

SQL SERVER: HAVING Clause
Q) Write a query to display the total salary by city for cities whose total salary is greater than 12000.
SELECT city, SUM(salary) AS 'total_salary' FROM employee GROUP BY city HAVING SUM(salary) > 12000;
Q) Write a query to display the total salary by city for cities whose average salary is at least 23000.
SELECT city, SUM(salary) AS 'total_salary' FROM employee GROUP BY city HAVING AVG(salary) >= 23000;

SQL SERVER: Subqueries
Q) Write a query to display the details of the employees whose employee numbers are 101 and 102.
SELECT * FROM employee WHERE emp_no IN (101, 102)
Q) Write a query to display the details of the employees who belong to the ECE department.
SELECT emp_no, emp_name, salary FROM employee WHERE dept_no IN (SELECT dept_no FROM dept WHERE dept_name = 'ECE')

SQL SERVER: TOP Clause
Q) Write a query to display the first record from the table.
SELECT TOP 1 * FROM employee
Q) Write a query to display the top 3 records from the table.
SELECT TOP 3 * FROM employee
Q) Write a query to display the last record from the table.
SELECT TOP 1 * FROM employee ORDER BY emp_no DESC

SQL SERVER: Ranking Functions
Student details table:

Student_No  Student_Name  Percentage  Row_ID  Rank_ID  DenseRank_ID
105         James         87          1       1        1
106         John          83          2       2        2
101         Anil          83          3       2        2
104         Vijay         83          4       2        2
108         Rakesh        76          5       5        3
102         Sunil         76          6       5        3
103         Ajay          76          7       5        3
107         Ram           75          8       8        4

Q) Write a query to display student details along with a row number, ordered by student name.
SELECT *, ROW_NUMBER() OVER (ORDER BY student_name) AS 'Row_ID' FROM student
Q) Write a query to display the even-numbered records from the student table.
SELECT * FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY student_no) AS Row_ID FROM student) AS t WHERE t.Row_ID % 2 = 0
Q) Write a query to display the odd-numbered records from the student table.
SELECT * FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY student_no) AS Row_ID FROM student) AS t WHERE t.Row_ID % 2 != 0
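The three ranking functions in the table above can be reproduced with a short sketch (Python's sqlite3; window functions require SQLite 3.25 or newer, which recent Python builds bundle):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE student (student_no INTEGER, student_name TEXT, percentage INTEGER)")
cur.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    (105, "James", 87), (106, "John", 83), (101, "Anil", 83), (104, "Vijay", 83),
    (108, "Rakesh", 76), (102, "Sunil", 76), (103, "Ajay", 76), (107, "Ram", 75),
])

# ROW_NUMBER: unique sequence; RANK: ties share a rank, then a gap follows;
# DENSE_RANK: ties share a rank with no gaps.
rows = cur.execute("""
    SELECT student_name,
           ROW_NUMBER() OVER (ORDER BY percentage DESC) AS row_id,
           RANK()       OVER (ORDER BY percentage DESC) AS rank_id,
           DENSE_RANK() OVER (ORDER BY percentage DESC) AS denserank_id
    FROM student
    ORDER BY row_id
""").fetchall()
for r in rows:
    print(r)
```

The order of rows within a tie group (the three 83s, the three 76s) is arbitrary unless you add a tie-breaking column to the OVER clause, which is why the Row_ID column in the table above is only one of several valid numberings.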