It looks like you don't quite understand how indexes work. The simplest analogy is the index at the back of a book. Suppose we are looking for the definition of a term. We find the term in the subject index, go to the pages listed for it, and read the definition there. The alternative is to leaf through the entire book looking for a mention of the term. Feel the difference in time.

Databases usually store information in fixed-size pages. For example, SQL Server uses 8-kilobyte pages (8192 bytes), of which 8060 bytes are available for user data and the rest is taken up by service information.

SQL Server uses a structure called a balanced tree (B-tree). The tree consists of a root node containing a single page, several intermediate levels containing additional pages, and a leaf level.
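To make the root / intermediate / leaf layout concrete, here is a minimal Python sketch of such a tree. It is a toy model, not how SQL Server lays out pages: the `Page` class, `build_tree`, `lookup`, the fan-out of 3, and the made-up `termNN` keys are all assumptions chosen for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Page:
    """One index page: sorted keys plus either child pages or row payloads."""
    keys: list
    children: list = field(default_factory=list)  # filled on non-leaf pages
    rows: list = field(default_factory=list)      # filled on leaf pages


def chunks(seq, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def build_tree(items, fanout):
    """Build a toy tree bottom-up from a non-empty sorted list of (key, row)."""
    level = [
        Page(keys=[k for k, _ in part], rows=[r for _, r in part])
        for part in chunks(items, fanout)
    ]
    while len(level) > 1:
        # A parent entry is the first key of each child page.
        level = [
            Page(keys=[p.keys[0] for p in group], children=group)
            for group in chunks(level, fanout)
        ]
    return level[0]


def lookup(root, key):
    """Walk from the root to a leaf; return (row or None, pages read)."""
    page, pages_read = root, 1
    while page.children:
        # Follow the last child whose first key is <= the search key.
        i = max(bisect_right(page.keys, key) - 1, 0)
        page = page.children[i]
        pages_read += 1
    if key in page.keys:
        return page.rows[page.keys.index(key)], pages_read
    return None, pages_read


# 27 made-up terms, 3 entries per page: 9 leaf pages, 3 intermediate, 1 root.
items = [(f"term{i:02d}", i) for i in range(27)]
root = build_tree(items, fanout=3)
print(lookup(root, "term17"))  # (17, 3)
```

Every lookup in this toy tree reads exactly three pages, whether or not the key exists, because all leaf pages sit at the same depth.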
The leaf-level pages contain individual entries corresponding to the indexed data. The number of rows on an index page depends on the size of the indexed columns. SQL Server builds the intermediate levels by taking the first entry of each leaf-level page and storing those entries on a page together with a pointer to the leaf page. The root page is built the same way.

Say we need to find the term "SQL Server". The picture shows an example of a B-tree. First, the root page is checked. We know that alphabetically S falls between O and T, so we go to the page for O. This single step has already eliminated a third of the data. Now we look at the intermediate-level page, find the entry for S, and go straight to the leaf page. Three index pages had to be read to find the data. The same holds for a term starting with any letter. In other words, a balanced tree means that finding any piece of data always requires reading the same number of index pages.

A little math.

Say we need to create an index on a column of type char(60). Each row of the table takes 60 bytes in the index, so storing 100 rows takes 6000 bytes. All of this fits on a single page, so our index consists of just one page, which is both the root and the leaf.
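The page arithmetic in this estimate can be checked with a few lines of Python. This is a rough sketch that, like the estimate itself, ignores per-row overhead and pointers; the function names are made up for illustration.

```python
import math

PAGE_DATA_BYTES = 8060  # usable bytes per page, the figure quoted above


def rows_per_page(entry_bytes, page_bytes=PAGE_DATA_BYTES):
    """How many fixed-size index entries fit on one page (overhead ignored)."""
    return page_bytes // entry_bytes


def leaf_pages_needed(row_count, entry_bytes):
    """Leaf pages required to hold row_count entries, packed full."""
    return math.ceil(row_count / rows_per_page(entry_bytes))


print(100 * 60)                    # 6000 bytes for 100 char(60) entries
print(rows_per_page(60))           # 134
print(leaf_pages_needed(100, 60))  # 1
```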
One page holds 134 rows (8040 bytes), so when the 135th row is added, two additional pages have to be created. The tree will then contain three pages: the root and two leaf pages. The first leaf page holds the first half of the entries, the second leaf page the second half. The root page holds two rows. No intermediate level is needed yet.

The table can grow to 17956 rows without the number of index levels changing. The leaf level will then consist of 134 pages of 134 entries each, and the root page will hold the same 134 entries. When row 17957 is added to the table, one more leaf page is needed, and with it a 135th entry in the root. That entry no longer fits, so an intermediate level containing two pages has to be added. The first page will contain the first entries of the first half of the leaf-level pages, and the second will contain the first entries of the second half. The root page will contain two rows corresponding to the first values of the two intermediate-level pages.
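These thresholds are just powers of the per-page capacity. A small sketch, assuming every page is packed completely full (the simplification the estimate uses) and using a hypothetical helper `index_levels`:

```python
def index_levels(row_count, entries_per_page):
    """Levels in a fully packed tree, which is also pages read per lookup."""
    levels, capacity = 1, entries_per_page
    while row_count > capacity:
        capacity *= entries_per_page
        levels += 1
    return levels


ENTRIES_PER_PAGE = 134  # char(60) entries per page, from the text

for rows in (134, 135, 17956, 17957):
    print(rows, index_levels(rows, ENTRIES_PER_PAGE))
# 134 1
# 135 2
# 17956 2
# 17957 3
```

Real indexes are rarely packed completely full, so actual level boundaries would come somewhat earlier than this model suggests.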
When the 2406105th row is added to the table, one more intermediate level will be created.

So finding a row in a table of roughly 2.4 million rows requires reading only three pages, and four pages are enough until the table grows past about 320 million rows.

And if we index a column not of type char(60) but of type int, which takes 4 bytes, we only have to read one page until the 2016th row is added, two pages until the table passes 4 million rows, and three pages until the number of records exceeds 8 billion.

The B-tree is the most common type of index, but not the only one. Different DBMSs use other types for specific purposes. The subject is very broad and there is plenty of material on it.

So, back to the indexes themselves. They come in two kinds: clustered and nonclustered. A table can have only one clustered index, and it effectively stores the whole table: its leaf level is the actual table data. The classic case is a table with an ID column on which a clustered index is created. In a properly designed relational database, each table usually has a clustered index, although there are exceptions.

A nonclustered index, by contrast, does not store all the data. It stores a reference into the clustered index, through which the remaining columns can be retrieved.

For example, we have a table of individuals with the columns ID, full name (FIO), date of birth, and INN (taxpayer number). The table has a clustered index on ID and a nonclustered index on INN. We want to get a person's full name by INN. We look up the INN in the nonclustered index and get the ID of the matching person, then we look up that ID in the clustered index and get the full name. If the table has many rows, the difference between reading every record and reading through the indexes is enormous.
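The two-step lookup can be sketched in Python. This is only a model of the idea: a dict stands in for the clustered index, a sorted list of pairs for the nonclustered one, and the rows, names, and INN values are invented for illustration.

```python
from bisect import bisect_left

# Clustered index: the table itself, keyed by ID. Rows are made up.
people_by_id = {
    1: {"full_name": "Ivanov Ivan", "birth_date": "1980-05-01", "inn": "770000000003"},
    2: {"full_name": "Petrova Anna", "birth_date": "1992-11-23", "inn": "770000000001"},
    3: {"full_name": "Sidorov Oleg", "birth_date": "1975-02-14", "inn": "770000000002"},
}

# Nonclustered index on INN: sorted (inn, id) pairs, no other columns.
inn_index = sorted((row["inn"], row_id) for row_id, row in people_by_id.items())


def full_name_by_inn(inn):
    """Seek the INN in the nonclustered index, then fetch the row by ID."""
    i = bisect_left(inn_index, (inn,))
    if i == len(inn_index) or inn_index[i][0] != inn:
        return None
    row_id = inn_index[i][1]                  # step 1: INN -> ID
    return people_by_id[row_id]["full_name"]  # step 2: ID -> full row


def full_name_by_scan(inn):
    """The alternative without an index: read every row until a match."""
    for row in people_by_id.values():
        if row["inn"] == inn:
            return row["full_name"]
    return None


print(full_name_by_inn("770000000001"))  # Petrova Anna
```

With three rows the two functions cost about the same. The gap only appears as the table grows, since the seek touches a handful of entries while the scan touches them all.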
If the table has few rows, the query optimizer may not use the nonclustered INN index at all, because it decides that reading the whole table is cheaper. There are many subtleties here.

There is also the notion of a covering index. For example, we have several INNs and want to check whether they exist in our table (the one described above), returning only the INNs that do exist. We find the INNs in the nonclustered index and never need to go to the clustered index, because we already have all the data we need. Indexes that contain all the data a query needs are called covering.

Now say the same table also has an index on full name and date of birth, and we need to select all records whose date of birth falls within a given range and output them. We have no index suitable for filtering, but there are two indexes (full name + date of birth, and the clustered one) that contain all the data needed for the output. One of them still has to be read in full, and the logical choice is the smaller one. Accordingly, the query optimizer will pick our covering index on full name + date of birth and read it in full instead of reading the whole table (the clustered index). And if the table has a hundred columns, the clustered and nonclustered indexes will differ in size by several times, if not by tens of times.
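Both covering cases can be modelled in Python. As before, this is a hedged sketch with invented rows and hypothetical helper names, and sorted lists stand in for real index structures. It assumes the range query outputs only the full name and date of birth.

```python
from bisect import bisect_left
from datetime import date

# Made-up rows: (id, full_name, birth_date, inn). A real table might carry
# a hundred more columns, which is what makes the full rows wide.
rows = [
    (1, "Ivanov Ivan", date(1980, 5, 1), "770000000001"),
    (2, "Petrova Anna", date(1992, 11, 23), "770000000002"),
    (3, "Sidorov Oleg", date(1975, 2, 14), "770000000003"),
]

# Nonclustered index on INN: only the key and the row ID.
inn_index = sorted((inn, row_id) for row_id, _, _, inn in rows)

# Nonclustered index on (full_name, birth_date), again plus the row ID.
name_birth_index = sorted((name, born, row_id) for row_id, name, born, _ in rows)


def existing_inns(candidates):
    """Answered from inn_index alone; the full rows are never touched."""
    found = []
    for inn in candidates:
        i = bisect_left(inn_index, (inn,))
        if i < len(inn_index) and inn_index[i][0] == inn:
            found.append(inn)
    return found


def born_between(start, end):
    """No index is ordered by birth date, so scan the narrow index in full."""
    return [(name, born) for name, born, _ in name_birth_index
            if start <= born <= end]


print(existing_inns(["770000000002", "999999999999"]))  # ['770000000002']
print(born_between(date(1979, 1, 1), date(1993, 1, 1)))
```

The first function seeks, the second scans, but in both cases the narrow index already holds every column the query needs, which is what makes it covering.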