    Posts made by shizuka

    • Trying to understand this quiz question on INSERT

      Why will there be 2 rows and not 1? The ROLLBACK is meant to take it back to the last COMMIT, which in this case is issued after one INSERT.

      [quiz screenshot not preserved]
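
      Since the screenshot didn't survive, here is a generic sketch of the rule being invoked (hypothetical table t, not the quiz's actual code, assuming the statements run inside one explicit transaction with autocommit off): ROLLBACK undoes only the work performed since the last COMMIT.

      CREATE TABLE t (id INT);
      INSERT INTO t VALUES (1);
      COMMIT;                   -- row 1 is now permanent
      INSERT INTO t VALUES (2);
      ROLLBACK;                 -- undoes row 2 only
      SELECT COUNT(*) FROM t;   -- 1 row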

      posted in SQL
    • RE: How to gauge performance of switching to PostgreSQL table partitioning?

      Is the dataset greater than RAM, but by less than 20 fold? Can you arrange it so that for one time period, you query mostly just one value of that column, then the next time period, mostly another value, and so on? If so, you should be able to get better cache usage.

      If not, then it is hard to see where the increased performance would come from. Maybe seeing an actual example query would help.
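
      As a hedged illustration of that arrangement (table, column, and partition names are hypothetical), list-partitioning on the frequently-filtered column keeps each period's queries on mostly one partition's pages:

      CREATE TABLE events (
          region  text NOT NULL,
          ts      timestamptz NOT NULL,
          payload jsonb
      ) PARTITION BY LIST (region);

      -- one partition per frequently-queried value
      CREATE TABLE events_eu PARTITION OF events FOR VALUES IN ('eu');
      CREATE TABLE events_us PARTITION OF events FOR VALUES IN ('us');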

      posted in SQL
    • RE: Mysql, failing xtrabackup and weird undo tablespaces metrics

      So, long story short: there is a bug in xtrabackup that can be hit when certain conditions on database size/layout/qps are met; for those who meet them, it is 100% reproducible (on the other hand, the same version/binaries can successfully work with other instances). At the same time, I managed to find a thread on a Chinese server, with a half-Chinese/half-English discussion, describing the root cause of this (no traces in the English-speaking part of the world, though).

      This bug has nothing to do with the undo tablespace metrics shown. In fact, despite the zeroes, these tablespaces are fully functional.

      I was unable to reproduce the issue with xtrabackup 8.0.26.

      posted in SQL
    • RE: MS Access and multiple users

      My experience with MS Access is similar in that it doesn't do well with multiple users. You're better off using an RDBMS that was designed for multiple users.

      In the US, hospitals have legal requirements for storing and protecting patient data. I would take local regulations into account when picking out an RDBMS, with cost as one of the lower-priority requirements. Row Level Security (RLS) would be high on my decision matrix.

      IIRC, an MS Access front end can connect to an MS SQL Server (Express) RDBMS, and it might be able to connect to other RDBMSs via ODBC. So all your UI work may not have been lost.

      As you grow, you'll have multiple applications interacting with the data, and you should plan on that happening. I do this by putting most or all of my data logic in the database (stored procedures, etc.). The front-end technology is then chosen based on how easily it can handle this type of design.

      Beyond that: "Which RDBMS should I use?" is off topic for this site.

      posted in SQL
    • RE: Are missed rows and duplicate rows the symptoms due to allocation order scan or due to the no lock and page splits?

      NOLOCK results in an allocation order scan rather than an index order scan.

      This is not correct. Using read uncommitted means the storage engine has a choice between allocation-order and index-order scans. It may choose either strategy at runtime without the execution plan recompiling.

      NOLOCK doesn't block writes because it doesn't take shared locks on the table.

      Reading data at read uncommitted means shared locks are not taken at the row, page, or table granularity. Since these locks are not taken, they will not block an exclusive lock needed by a concurrent transaction to change data.

      During the NOLOCK scan, since there are no shared locks, if a write (insert/update) happens prior to the point the scan has currently reached, then this situation will cause missing records.

      This is not specific to read uncommitted. Reading data using locking read committed or repeatable read can also miss committed data.

      Read committed isolation (https://docs.microsoft.com/en-us/archive/blogs/craigfr/read-committed-isolation-level) normally releases a shared lock on a row just before reading the next row.

      Repeatable read (https://docs.microsoft.com/en-us/archive/blogs/craigfr/repeatable-read-isolation-level) maintains shared locks to the end of the transaction, but only data actually encountered so far is locked. A row ahead of the current scan position can move behind the scan point if an index key value is changed.

      [Illustrative diagrams appear in Craig Freedman's posts linked above.]

      The issues described above are specific to index-order scans.


      Similarly, during a write (insert/update), when a page split happens after a row is read, and that row is now part of the next page (due to the page split), then this situation will cause duplicate records.

      Index-order scans do not care about page splits. Pages are linked in logical key order both before and after the split. An index-order scan will encounter rows on the page that split, and the new page arising from the split. There's no way to avoid this if you are following pages in key order.

      1. As shown in my examples above, the missing/duplicate records problem is caused by NOLOCK and page splits. Correct? Can it also be caused by allocation order scans?

      No, those were examples of rows being missed or encountered multiple times due to rows moving in index order while an index-order scan is in progress.

      Rows being missed or encountered multiple times due to page splits is an issue that can only happen when an allocation-order scan is used. When scanning in allocation order, a page split moves some rows to a new page. That new page may or may not be encountered by the allocation-order scan. If the page that split has already been encountered, we see some rows again. If the split page had not been encountered yet, we might miss some rows if the new page falls behind the allocation-order scan position.

      Allocation-order scans are only possible without using read uncommitted if the engine has an acceptable guarantee that the data cannot change during the scan. Missing committed rows or encountering them multiple times due to an allocation-ordered scan over changing data is therefore specific to using read uncommitted isolation.

      1. If an allocation order scan can indeed cause missing rows/duplicates, then I want to ask: suppose the engine had used an index order scan (I know that the engine won't do this, but assume it for the sake of this question) rather than the allocation order scan; how would that have solved the missing/duplicate rows problem?

      Using an index-order scan avoids missing/duplicate rows due to page splits, as described. Missing or duplicate rows can still occur due to index key changes, also as described above.

      Note again: Using read uncommitted does not guarantee you get an allocation-order scan.
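
      (A hedged aside, with a hypothetical table name: read uncommitted can be requested per table or per session, and neither form dictates the scan strategy.)

      SELECT COUNT(*) FROM dbo.SomeTable WITH (NOLOCK);  -- table hint

      SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;  -- session level
      SELECT COUNT(*) FROM dbo.SomeTable;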

      Further reading:

      • https://docs.microsoft.com/en-us/archive/blogs/sqlcat/previously-committed-rows-might-be-missed-if-nolock-hint-is-used by Lubor Kollar (SQL Server Development Team)
      • https://sqlperformance.com/2015/04/t-sql-queries/the-read-uncommitted-isolation-level by me
      • https://sqlperformance.com/2015/01/t-sql-queries/allocation-order-scans by me
      posted in SQL
    • Postgres psql command intermittently 'stalls' in powershell

      I'm in the process of moving a postgres database to a new server.

      I have a number of large tables, partitioned by month; each contains several tens of millions of rows.

      I have a powershell script that does the following:

      • Copy the backup file (for a single partition) from the archive location (using robocopy).
      • Load the file back into the database using psql.
      • Drop the backup file.
      • Loop back to the beginning and move on to the next partition.

      Copying the file from the archive takes maybe 10 minutes and loading it typically 40 minutes, then it moves on to the next partition. The problem is that the PowerShell console occasionally seems to get 'stuck' after loading a partition. When a partition seems to be taking too long (say, an hour), I check the Windows Task Manager, and none of the processes seem to be doing anything. I check the dashboard in pgAdmin4 (no locks, most sessions idle). Then I check the record count for the partition being loaded: the count is static (still the same 5 minutes later). So as far as I can tell, the partition has been loaded successfully and the database is mostly idle, but the PowerShell console running the load session hasn't changed. Then I shift focus to the PowerShell console, press Return a couple of times, and suddenly it bursts into life: the console tells me it has moved on to the next partition, Task Manager indicates that the powershell and postgres processes are running again, and pgAdmin4 also shows the database has sprung back to life.

      The PowerShell script logs each step to the screen using Write-Host, including the current time, so I always know what the most recent step was. My conclusion is that the interaction between PowerShell and psql is somehow 'broken': PowerShell hasn't worked out that psql has finished and doesn't know to move on to the next step.

      Other info: there's no sign of any errors (that I can spot), all the tables seem to get populated correctly (eventually), and the script works fine on all the smaller partitions. This only happens on about 1 partition in 10, and the only cure seems to be pressing Enter a couple of times.

      This isn't a critical problem, and occasionally hitting Enter a couple of times is a viable workaround, but an understanding of what might be going on (or even a solution) would be much appreciated.

      posted in SQL
    • RE: Compare signed integers as if they were unsigned

      You could compare like

      a > b AND (a < 0) = (b < 0) OR a < 0 AND b >= 0

      (the sign check is needed so that a non-negative a does not compare greater than a negative b, since the negative value is the larger one when read as unsigned).
      

      An ORDER BY clause could be

      ORDER BY a < 0, a
      

      (which works because FALSE < TRUE).
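
      A quick hypothetical check of the ordering:

      SELECT a
      FROM (VALUES (0::bigint), (5), (9223372036854775807), (-9223372036854775807), (-1)) AS v(a)
      ORDER BY a < 0, a;
      -- returns 0, 5, 9223372036854775807, -9223372036854775807, -1:
      -- non-negative values first, negatives last, i.e. unsigned 64-bit order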

      Or you could do the smart thing and use the data type numeric, which allows you to store the numbers as they are.

      posted in SQL
    • Why is statistics update slower on a clustered columnstore than on a rowstore?

      I've moved several large tables (each with >10^9 rows and a couple of dozen columns) from clustered rowstore to clustered columnstore indexes on a SQL Server 2014 instance and noticed that statistics updates on those tables (default sampling, triggered in our ETL or from Hallengren scripts) now take significantly longer.

      The more theoretical question is: why is this so? My wild guess is that a statistics update produces a lot of random reads, which doesn't work well with columnstore indexes, as they are better suited to sequential reads of large amounts of data. I would be happy to see a more in-depth explanation.

      The more important question is whether I can do something about it. I've tried my test case for a table with a single bigint column (see below) on a SQL Server 2017 instance with the same result. Incremental statistics seem on paper to be a good solution: I would need to recreate all statistics objects (which are currently not incremental, probably for historical reasons), extend the ETL logic, and update our version of the Hallengren scripts (we currently use an old one). I would appreciate it if someone would share their experience before I go down this rabbit hole.
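
      For reference, a minimal sketch of the incremental-statistics route (the statistics name is hypothetical, and incremental statistics require the table to be partitioned):

      CREATE STATISTICS stat_col1 ON dbo.columnstore (col1) WITH INCREMENTAL = ON;

      -- later, resample only the partitions that changed
      UPDATE STATISTICS dbo.columnstore (stat_col1) WITH RESAMPLE ON PARTITIONS (1);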

      Steps to reproduce:

      /*Create a rowstore and a columnstore table with a single bigint column*/
      CREATE TABLE dbo.rowstore (col1 BIGINT);
      GO
      

      CREATE TABLE dbo.columnstore (col1 BIGINT);
      GO

      CREATE CLUSTERED COLUMNSTORE INDEX CCI_columnstore ON dbo.columnstore;
      GO

      /* Fill both tables with 400 * 10^6 rows. This results in a 15 GB rowstore table and a 3.1 GB columnstore table */
      ;WITH e1(n) AS
      (
      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
      ), -- 10
      e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b), -- 10*10
      e3(n) AS (SELECT 1 FROM e2 CROSS JOIN e2 AS b), -- 100*100
      e4(n) AS (SELECT 1 FROM e3 CROSS JOIN e3 AS b) -- 10000*10000
      INSERT dbo.rowstore WITH (TABLOCK)
      SELECT CAST(CAST(NEWID() AS VARBINARY(8)) AS BIGINT) FROM e4;

      GO 4

      INSERT dbo.columnstore WITH (TABLOCK)
      SELECT * FROM dbo.rowstore
      GO

      /* Trigger stats creation */
      SELECT TOP 1 * FROM dbo.rowstore WHERE col1>0

      SELECT TOP 1 * FROM dbo.columnstore WHERE col1>0
      GO

      SET STATISTICS TIME, IO ON

      /* This runs 1.5 seconds */
      UPDATE STATISTICS dbo.rowstore

      /* This runs 8 seconds and becomes much slower than rowstore on really large tables */
      UPDATE STATISTICS dbo.columnstore

      posted in SQL
    • SQL Server Profiler - Client unable to establish connection

      Spanking brand-new installations of:

      • Visual Studio 2022 Community (VS)
      • SQL Server 2019 Dev (SQL)
      • SQL Server Management Studio v18.10 (SSMS)

      There are no other Visual Studio or SQL Server instances or installations.

      I can look at my local SQL instance using SSMS but not SQL Server Profiler. I am using Server Name "." and Windows Authentication in both cases. Profiler is being launched both from within SSMS and directly (in administrative mode or not), with the same result: SSL Provider: The certificate chain was issued by an authority that is not trusted.

      SQL-related services are in these states (enabling SQL Server Browser has no effect):

      [screenshot of the SQL-related services' states not preserved]

      "Trust server certificate" does not help. I suspect the provider is extremely sloppy about the root cause of this response.

      (Same system environment as https://stackoverflow.com/posts/70805318/ )

      posted in SQL
    • RE: How to check the components that contribute to the size of the SQL data and log files?

      There is a system catalog view that breaks down this information by file per database:

      SELECT * FROM sys.master_files WHERE database_id = DB_ID('YourDB')

      https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-master-files-transact-sql?view=sql-server-ver15
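
      A slightly more targeted variant (the size column is in 8 KB pages, hence the arithmetic):

      SELECT name, type_desc, size * 8 / 1024 AS size_mb
      FROM sys.master_files
      WHERE database_id = DB_ID('YourDB');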

      posted in SQL
    • Will updating a row with identical values create entries for logical replication

      I am using PostgreSQL 14, and it is set up for logical replication; I am trying to stream changes with Debezium version 1.8. Everything seems OK: every insert, update, and delete is streamed, except for updates that set a value equal to itself. According to this great answer (https://dba.stackexchange.com/questions/118178/does-updating-a-row-with-the-same-value-actually-update-the-row), this kind of update creates a WAL record, but is it actually replicated over the publication?

      I also need to mention that I asked the same question in the Debezium chat, but they asked me these two questions back:

      1. Is the change propagated to the replication slot?
      2. Is the change propagated into the publication?

      I would be happy if you can provide ways to validate those two.
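
      A hedged way to check the first point, using a scratch slot with the built-in test_decoding plugin (my_table, col, and id are placeholders):

      SELECT pg_create_logical_replication_slot('scratch_slot', 'test_decoding');

      UPDATE my_table SET col = col WHERE id = 1;  -- same-value update

      -- if an UPDATE record shows up here, the change reached the slot
      SELECT * FROM pg_logical_slot_peek_changes('scratch_slot', NULL, NULL);

      SELECT pg_drop_replication_slot('scratch_slot');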

      posted in SQL
    • RE: Is there any harm in storing other rows' int PKs in a hierarchyID instead of an arbitrary number?

      I would ask: what is the purpose of this HID?
      If it is constructed from database-internal, numeric, surrogate key values, then you should never be showing those to a user anyway, so what's the purpose of assembling them in this way?

      Anyway ...

      The most important question to ask before deciding how to store any piece of Data is this:

      How are you going to use this Data?

      How are you going to "get to" this Data? (Do you need to index it?)

      What are you going to do with this Data, having got it? (Do you need to further subdivide it?)

      This last one is interesting, because one thing you might want to do is find an entry's parent (e.g. Seattle -> Washington). But how can you do that efficiently, armed only with the complete HID? You'd need to dissect the value, extract the last element (which is almost always more difficult than getting the first), and then go look up the parent record.
      In such a case, it might be better to hold only the parent record's id and use a recursive query to derive the HID value completely dynamically.

      If your id values ever changed (which, of course, they never, ever should), then your full HIDs would all be fouled up and would have to be regenerated en masse, and that's painful.
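
      A minimal sketch of that recursive approach (table and column names are hypothetical):

      WITH walk AS (
          SELECT id, parent_id,
                 CAST('/' + CAST(id AS varchar(10)) + '/' AS varchar(4000)) AS hid_path
          FROM dbo.places
          WHERE parent_id IS NULL
          UNION ALL
          SELECT p.id, p.parent_id,
                 CAST(w.hid_path + CAST(p.id AS varchar(10)) + '/' AS varchar(4000))
          FROM dbo.places AS p
          JOIN walk AS w ON w.id = p.parent_id
      )
      SELECT id, hid_path FROM walk;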

      posted in SQL
    • RE: Which TLS version is database mail using?

      If I'm right, SQL Server Database Mail uses System.Net.Mail to do the work, and SQL Server 2014 Database Mail is built for .NET 3.5.

      System.Net.Mail can send mail using TLS 1.2 when the runtime version is 4.6 or above, so SQL Server 2014 Database Mail should not be able to support TLS 1.2 until some CU/SP.

      Maybe patching the 3.5 .NET framework would fix your issue.

      https://support.microsoft.com/en-us/topic/kb3135244-tls-1-2-support-for-microsoft-sql-server-e4472ef8-90a9-13c1-e4d8-44aad198cdbe

      posted in SQL
    • RE: How do I fix this SQL server query 2008r2

      I'm not sure I understand your problem; please try this one:

      UPDATE x
      SET x.type= 'newid' 
      FROM Titem AS x
      INNER JOIN TCharacter AS y ON x.char_id = y.id
      INNER JOIN Titem AS z ON z.char_id = y.id
      WHERE x.type = 'oldid'
      and z.type = '2nd id';
      

      https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=9ee15cc0ad794b3f8da13da219018d94

      Maybe you can add some data to the db fiddle...

      posted in SQL
    • RE: Snapshot isolation transaction aborted due to update conflict when removing rows from the database

      The query plans (https://www.brentozar.com/pastetheplan/?id=SkT72_WRY) show merge joins with full scans, so all rows are touched. The culprit is the table variable in the dbo.spPurge procedure:

      declare @RedundantInsertionIDs table (InsertionID bigint, UpdateTime DateTime)
      

      This lacks an index to help optimize the delete queries. I observed that a primary key on the table variable changes the plans (https://www.brentozar.com/pastetheplan/?id=r1BGtvZAt) such that the target tables are accessed with a seek instead of a scan, touching only the rows to be deleted rather than the active rows too.

      declare @RedundantInsertionIDs table (InsertionID bigint primary key, UpdateTime DateTime);
      

      If you still experience issues, a temporary table instead of a table variable may help. You could also implement a retry in the purge proc as a last resort.
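
      A rough sketch of such a retry (my own suggestion, not taken from the plans above; 3960 is the update-conflict error number from the title):

      DECLARE @attempts int = 0;
      WHILE @attempts < 3
      BEGIN
          BEGIN TRY
              EXEC dbo.spPurge;
              BREAK;
          END TRY
          BEGIN CATCH
              IF ERROR_NUMBER() <> 3960
                  THROW;  -- re-raise anything that isn't an update conflict
              SET @attempts += 1;
              WAITFOR DELAY '00:00:01';
          END CATCH
      END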

      posted in SQL
    • RE: How to install PostgreSQL Anonymizer 0.8.1 on Amazon RDS?

      The answer is mostly described already in https://gitlab.com/dalibo/postgresql_anonymizer/-/issues/161 . I'll add it here as well to make it a bit easier to find and understand.

      # clone the repo and cd into it
      git clone https://gitlab.com/dalibo/postgresql_anonymizer.git
      cd ./postgresql_anonymizer
      

      Checkout the version that allows standalone installation

      git checkout 0.8.1

      Create the anon_standalone.sql file

      make standalone

      Execute that sql file against your Amazon RDS instance

      psql -h <host> -U <user> -d <dbname> -a -f anon_standalone.sql

      posted in SQL
    • SQL Server Availability group and Backups

      SQL Server availability group backup preferences.

      We have 3 nodes for our AG.

      I would like to take full and differential backups on primary, log backups on secondary.

      When I configured "any replica" made backup priority 50 for all replicas log backups aren't running on secondary replica.

      If I do prefer secondary full backups wont run on primary.

      How can I configure to take full/diff on primary only and log backup on secondary only?

      The reason we wanted to offload backups to secondary replica is to improve performance on primary during peak hours.

      Our current setup takes a full backup every 3 days and a diff every day after hours, and we have log backups running every 15 minutes, all through SQL Agent jobs with a custom SP.
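
      For what it's worth, backup jobs typically honor these preferences by checking the built-in function below on each replica and running the backup only where it returns 1 (the database name is a placeholder):

      SELECT sys.fn_hadr_backup_is_preferred_replica('YourDatabase') AS run_backup_here;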

      posted in SQL
    • Using OPENJSON, can't get results to GROUP BY

      I'm passing a JSON string to a stored procedure in order to insert warehouse inventory.

      Sometimes, there will be multiple line items with the same product, going to the same shelf, on the same pallet (LPN), with the same date. I would like to aggregate those into one line in the table.

      The JSON represents individual lines that have been received in, and put away into a location. Here there are 2 identical items:

      [{"StockCode": "ABC123", "Qty": "200", "Bin": "E4B4_L", "LPN": "1234", "PutDate": "2022-01-21 18:35:53 UTC"}, {"StockCode": "ABC123", "Qty": "400", "Bin": "E4B4_L", "LPN": "1234", "PutDate": "2022-01-21 18:36:43 UTC"}]
      

      So:

      ABC | 200 |  2022-01-21 00:00:00.000 | LPN1234
      ABC | 400 |  2022-01-21 00:00:00.000 | LPN1234
      

      Should aggregate to:

      ABC | 600 |  2022-01-21 00:00:00.000 | LPN1234
      

      I tried to GROUP BY and SUM the qty, but the resulting lines in the table are still 2 separate lines. I realized the PutDate values were not the same due to the timestamp, so I thought for sure casting to DATE would solve it, but it did not.

      SQL script:

      ALTER Procedure spc_PutAway
      (@json NVARCHAR(MAX) = '')
      AS
      BEGIN
      

      INSERT INTO warehouse (Bin, StockCode, Qty, PutDate, VerDate, LPN)
      SELECT Bin, StockCode, Sum(Qty) as Qty, CAST(PutDate AS DATE) as PutDate, GETDATE() as VerDate, LPN
      FROM OPENJSON(@json)
      WITH (
      Bin VARCHAR(20) '$.Bin',
      StockCode VARCHAR(30) '$.StockCode',
      Qty DECIMAL(18,6) '$.Qty',
      PutDate VARCHAR(20) '$.PutDate',
      LPN VARCHAR(50) '$.LPN'
      )
      WHERE Bin <> 'DEFAULT'
      GROUP BY StockCode, Bin, PutDate, LPN

      END

      Results in table:

      Bin StockCode   Qty PutDate VerDate LPN TransID VerCol
      E4B4_L  ABC123  200.000000  2022-01-21 00:00:00.000 2022-01-21 10:52:43.823 1234    1   0x000000000000275D
      E4B4_L  ABC123  400.000000  2022-01-21 00:00:00.000 2022-01-21 10:52:43.823 1234    2   0x000000000000275E
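
      (A hedged aside: in T-SQL the GROUP BY above groups on the raw varchar PutDate returned by OPENJSON, not on the DATE alias from the SELECT list, so the two distinct timestamps still form separate groups. Grouping on the cast expression itself collapses them, as in this trimmed sketch:)

      SELECT CAST(PutDate AS DATE) AS PutDate, SUM(Qty) AS Qty
      FROM OPENJSON(@json)
      WITH (Qty DECIMAL(18,6) '$.Qty', PutDate VARCHAR(20) '$.PutDate')
      GROUP BY CAST(PutDate AS DATE);  -- group on the cast, not the raw varchar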
      
      posted in SQL
    • Backing up a DB on remote server

      I need to back up a Postgres DB which is located on a remote server. This server also hosts a lot of other stuff, and that's why I don't have general access to the server itself. Therefore I believe that I cannot use ssh to access the remote server and run pg_dump from there. (Please correct me if I am wrong regarding pg_dump.) The only thing I can do is write and read the DB via a DB connection (which is allowed through the firewall).

      What is the best approach to backing up such a DB?

      I tried pg_dump, but I don't think it'll work under such conditions.

      posted in SQL
    • Migrating from old AlwaysOn Availability Group to new AG cluster but keep ip addresses

      We want to migrate our SQL Server 2012 on Windows Server 2012 R2 to SQL Server 2019 on Windows Server 2016, keeping the IP addresses but changing the hostnames.

      I cannot change the listener name or the IP addresses in the application code. We have 10 TB of data and have been using a single availability group.

      Current situation :

      LISTENER : SQLLSN : 10.10.12.12

      Windows Server 2012 R2 - SQL Server 2012

      SQL_OLD1 - 10.10.12.2
      SQL_OLD2 - 10.10.12.3
      SQL_OLD3 - 10.10.12.4

      Windows Server 2016 - SQL Server 2019

      SQL_NEW1 - 10.10.12.5
      SQL_NEW2 - 10.10.12.6
      SQL_NEW3 - 10.10.12.7

      After migration:

      LISTENER : SQLLSN : 10.10.12.12
      

      SQL_NEW1 - 10.10.12.2
      SQL_NEW2 - 10.10.12.3
      SQL_NEW3 - 10.10.12.4
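
      (A hedged sketch of the listener cutover step I expect to need, runnable only once the old AG releases the IP; the new AG name and subnet mask are placeholders:)

      ALTER AVAILABILITY GROUP [NewAG]
          MODIFY LISTENER 'SQLLSN' (ADD IP ('10.10.12.12', '255.255.255.0'));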

      Any information and suggestion would be very appreciated.

      posted in SQL