    Posts made by chanisef

    • RE: Get count of unmapped records when using a mapping table

      I got it to work using NOT EXISTS and a subquery.

      SELECT
          COUNT(pr.id) AS product_count
      FROM
          products pr
      WHERE NOT EXISTS(
          SELECT
              *
          FROM
              vendor_product_map vp
          WHERE
              vp.product = pr.id AND vp.vendor = 3
      )
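
      For reference, the same count can also be written as a left anti-join, which planners typically execute the same way as NOT EXISTS (a sketch using the table and column names from the query above):

      ```sql
      -- Anti-join form: count products with no mapping row for vendor 3.
      SELECT
          COUNT(pr.id) AS product_count
      FROM
          products pr
          LEFT JOIN vendor_product_map vp
              ON vp.product = pr.id AND vp.vendor = 3
      WHERE
          vp.product IS NULL;
      ```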
      
      posted in SQL
      C
      chanisef
    • Does adding a non clustered index to foreign key speed up the join?

      I have 2 large tables.

      Sample query:

      SELECT USERNAME,S.ProductName,S.Amt,S.Date
      FROM USERS U
      JOIN SALES S
      ON U.ID=S.UID
      

      Presently, Users.ID and Sales.ID are the clustered indexes.

      To speed up the JOIN, will it help if I add a non-clustered index on Sales.UID, and add ProductName, Amt, and Date as included columns?

    • RE: How do I set up row level security based upon relationship in another table in PostgreSQL?

      Ok, so I figured out a way to do this. I don't have a lot of experience in this field, so there are probably better ways, but here it is.

      The first step is to set a policy that evaluates visibility based on the presence of a matching row in user_connections. The USING expression for the policy ended up looking like this:

      (
          (uid() = id) OR 
          (
              ( SELECT count(*) AS count
                  FROM user_connections
                  WHERE (
                      user_connections.user_a_confirmed AND
                      user_connections.user_b_confirmed AND
                      (
                          (user_connections.user_a_id = users.id) OR 
                          (user_connections.user_b_id = users.id)
                      ) AND
                      (
                          (user_connections.user_a_id = uid()) OR 
                          (user_connections.user_b_id = uid())
                      )
                  )
              ) >= 1
          )
      )
      

      The second step is to set up a publicly accessible view for users that filters down the visible columns. I did this via:

      CREATE OR REPLACE VIEW users_public AS
          SELECT id, display_name
          FROM users;
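
      For context, the USING expression above plugs into a policy definition; a sketch (the policy name and the FOR SELECT scope are assumptions, and <using_expression> stands for the expression shown earlier):

      ```sql
      ALTER TABLE users ENABLE ROW LEVEL SECURITY;

      -- Hypothetical wiring; substitute the USING expression shown above.
      CREATE POLICY users_visibility ON users
          FOR SELECT
          USING (<using_expression>);
      ```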
      
    • SQL (Postgres): query array Column which must be a subset of an array

      I have a table "article", and for each entry some "tags" are stored in an array column:

      name  | tags
      ------+---------
      test1 | {t1,t3}
      test2 | {t2,t3}
      test3 | {t3}
      test4 | {t1,t5}
      test5 | {t1,t3}
      test6 | {t2,t3}

      I can query for tags passed as an array:

      SELECT name, tags FROM article WHERE tags && array['t1','t3','t10']::_varchar

      This works, but I actually want only those entries to be returned whose tags are all contained in the passed array (not just one).
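
      For what it's worth, PostgreSQL also has an "is contained by" array operator, <@, which should express exactly that subset condition (a sketch against the same table):

      ```sql
      -- && checks for any overlap; <@ requires every element of tags
      -- to be present in the passed array.
      SELECT name, tags
      FROM article
      WHERE tags <@ array['t1','t3','t10']::_varchar;
      ```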


    • RE: I have to write query based on cases can i write like this

      The case construct has to have an "end".

      CASE 
      WHEN condition1 THEN value1
      WHEN condition2 THEN value2
      END [column-alias] 
      

      So, the first few lines of your query might look more like this:

      SELECT DISTINCT 
        AD_MA_CODE 
      , AD_PA_CODE 
      , AD_EN_CODE_PSV 
      , AD_CODE 
      , CASE 
        WHEN AD_EN_CODE_PSV = ID_EN_CODE_PSV AND AD_CODE = ID_AD_CODE 
          THEN '1' 
        WHEN AD_CODE_POSTAL=ZL_CODE_POSTAL AND AD_VILLE=ZL_VILLE 
          THEN AD_ADR_DEFAUT
        END c1
      , AD_FACTURATION_VN 
      . . . 
      

      I would also caution against the "blind" use of SELECT DISTINCT.
      If you're getting "duplicate" rows in your query, then removing the cause of that duplication (often an incorrect join condition) is a better choice than trying to get the database to resolve them out afterwards.
      This can have a huge negative performance impact, especially on result sets with a large number of columns (distinct a, b, c can perform as poorly as group by a, b, c order by a, b, c - and you've got a lot more than three columns there!).

    • How to convert json array to json object in PostgreSQL?

      This is my column cmt_json_value, whose values are of type json, each holding a JSON array:

      [{"name": "Pending", "value": "PENDING"}, {"name": "Error", "value": "ERROR"},{"name":"Complete", "value":"COMPLETE"},{"name":"In-Progress", "value":"IN_PROGRESS"}]
      

      I want to write a PostgreSQL query to fetch name and value as columns from the table configuration_matrix.

      My existing query is:

      select cmt_json_value ->>'name' as name , cmt_json_value ->> 'value' as value
      from configuration_matrix
      where
      cmt_category = 'LIST_OF_VALUES' and
      cmt_key = 'JOB_STATUS'
      order by cmt_json_value ->> 'name' asc;
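
      Since cmt_json_value holds a JSON array rather than a single object, ->> 'name' on the whole column returns NULL. One possible approach (a sketch, assuming the column is of type json; use jsonb_array_elements for jsonb) is to unnest the array first:

      ```sql
      -- json_array_elements expands the array into one row per object,
      -- so ->> can then pull out the individual keys.
      SELECT elem ->> 'name'  AS name,
             elem ->> 'value' AS value
      FROM configuration_matrix,
           json_array_elements(cmt_json_value) AS elem
      WHERE cmt_category = 'LIST_OF_VALUES'
        AND cmt_key = 'JOB_STATUS'
      ORDER BY elem ->> 'name' ASC;
      ```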
      
    • DAX Calculate the sum of values for a year with start date and end date

      I can't manage, in DAX, to calculate the sum of the values that fall within a given year: I would like the proportion of each value that belongs to a year, according to its start date and end date.

      example

      For example, I want to be able to calculate for each year, the value of the "workload" during that year, so in 2019 the share of 150 between 01/02/2019 and the end of the year then the share of 150 between 01/01/2020 and 12/31/2020 then the share between 01/01/2021 and 05/05/2021 (without forgetting the phase 1-2 and the other IDs obviously).

    • How can I update one database schema to another database?

      I have 2 databases in the same postgresql server:

      • mydb: used for development
      • test_mydb: used for testing; it is preseeded with some test data that are significantly smaller than mydb

      mydb is updated via migration scripts. What I want is to sync those changes into test_mydb after the migrations are run; afterwards, I want to pg_dump the contents of test_mydb in order to generate a test dump.

      Is there a way to look for differences between the schemas of mydb and test_mydb and apply them to test_mydb?

      I want schema-only differences and not full data replication.

    • RE: SQL Data Masking a copy of data for backup

      YES, they will read the data in clear text!

      You have 2 options (plus a bonus) to achieve what you want:

      Dynamic Data Masking

      https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking?view=sql-server-ver15 : this is only applied on your SQL Server instance. As soon as you take a backup to a .bak file and send it to the vendor, the Dynamic Data Masking doesn't even come with it. The vendor doesn't even have to create a new user or change the ownership; their own user will already have full access.

      The way you should use Dynamic Data Masking ( https://www.sqlshack.com/using-dynamic-data-masking-in-sql-server-2016-to-protect-sensitive-data/#:%7E:text=Dynamic%20Data%20Masking%20is%20a,data%20at%20the%20database%20layer.&text=This%20feature%20requires%20no%20coding,data%20stored%20in%20the%20disk. ) is: you keep the database on your side and you create a user for the vendor:

      CREATE USER DDMUser WITHOUT LOGIN;  
      GRANT SELECT ON Employee_Financial TO DDMUser;
      

      you then mask a column:

      ALTER TABLE Employee_Financial  
      ALTER COLUMN EMP_Last_Name varchar(10) MASKED WITH (FUNCTION = 'default()'); 
      

      You then select that column as the user to see what they will see:

      EXECUTE AS USER = 'DDMUser';  
      SELECT * FROM Employee_Financial;  
      REVERT; 
      

      And as you are selecting that data as the user DDMUser, you will see only the masked values in that column.

      You now open a port for the vendor and you can allow them to connect to your database because they will not see the data you have masked.

      Static Data Masking

      https://www.mssqltips.com/sqlservertip/5939/sql-server-static-data-masking-example/ is only available in preview in SQL Server 2019 and Azure SQL Database.

      In that case you can mask the database and send it to the vendor.

      Bonus

      Lately I have been writing extensively ( https://www.jeeja.biz/blog/ ) about how to set up static data anonymization for SQL Server.

      After a few months I have to admit it: it's a pool of blood. I'm the first to tell you: don't go this way; keep this option as the very last chance. Focus on the Dynamic or Static Data Masking provided by Microsoft out of the box.

    • RE: Is there a way to install PostgreSQL using apt-get without it running initdb automatically?

      IIRC there is no way with apt directly, but you can do it more manually, as in the answer to this question: https://askubuntu.com/a/482936

      The process would be:

      1. Install the dependencies
      2. Get and unpack the package with apt-get download ; sudo dpkg --unpack *.deb
      3. Edit /var/lib/dpkg/info/.postinst (the linked answer just deletes it, but there may be other tasks you still require it to perform)
      4. Complete the install with sudo dpkg --configure ; sudo apt-get install -yf #To verify/fix dependencies

      That final step might be sufficient to allow you to skip installing dependencies beforehand in step 1; if that is the case, there is also the advantage that they'll be automatically removed if you remove pg and nothing else requires them.

      A manual approach like this might be fine for a one-off; on upgrades, the post-install script shouldn't see the need to run something like initdb, since things are already configured, and the initial install is the only point where that is needed. If you need this a lot, as part of a product install perhaps, it could become cumbersome; the other options then are to have a script that undoes what initdb does after the fact, or to maintain your own apt repo containing a modified version of this package (which is in itself a chunk of admin).

    • RE: How to insert various records from multiple tables with a condition

      In theory what I want should look like this:

      | StudentID | StaffID | Phone_Model | Class |
      | --------- | ------- | ----------- | ----- |
      | 1         | 1       | iPhone7     | S201  |
      | 2         | 3       | iPhone11    | S203  |
      

      Is there something missing from my SQL statement?

      You are missing a relation between Staff and Student: you need a StaffID int column in the Student table.

      create table Staff (
          StaffID int,
          Class varchar(10),
          TotalPhonesConfiscated int
      );
      

      insert into Staff values
      (1,'S201',1),
      (2,'S202',0),
      (3,'S203',1);

      create table Student (
      StudentID int ,
      Phone_Model varchar(20),
      StaffID int );

      insert into Student values
      (1,'iPhone7',1),
      (2,'Samsung',3);

      create table Phones (
      StudentID int,
      StaffID int ,
      Phone_Model varchar(20),
      Class varchar(10));

      Then you can insert the data:

      INSERT INTO Phones ( StudentID, 
                           StaffID, 
                           Phone_Model, 
                           Class
                         )
      SELECT   StudentID, 
               Student.StaffID, 
               Phone_Model, 
               Class
      FROM Student 
      INNER JOIN  Staff on Student.StaffID=Staff.StaffID
      WHERE TotalPhonesConfiscated >= 1;
      

      select * from Phones;

      This would result in:

      StudentID StaffID Phone_Model   Class
      1           1       iPhone7     S201
      2           3       Samsung     S203
      
    • How to handle incoming time-series data in a data warehouse?

      I am new to data warehousing (having learnt this in my school days) and am looking to build a data warehouse as a side project. Below is a simple data warehouse design I came up with:

      The data warehouse has 2 dimension tables and 1 fact table. One dimension table contains datetime data and the other contains the device data. The fact table contains the incoming device data values captured in the field. The data granularity in the fact table is 5 minutes.

      I am confused about how the time series is handled and would appreciate it if someone could clarify. Assuming one row of data coming from the device looks like this:

          datetime        drive_a   drive_b   drive_c   shaft_a   shaft_b   shaft_c   total_output
      02/01/2022 13:05      4.2       3.2       7.4       5.3       8.2       6.4        4563.2
      

      This will be processed and stored into the fact table in the data warehouse.

      How do I handle the datetime column from this incoming data since the dateKey is not the same format as the datetime incoming data?

      I am thinking that the fact table needs another column called dateTime_raw, but doesn't that defeat the purpose of a dim_datetime table, since my datetime would then already be in my fact table?
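
      For what it's worth, one common convention is to derive the fact row's date key from the incoming timestamp during the ETL step, so the raw datetime never needs to be stored in the fact table (a sketch; the yyyymmddhhmi key format is an assumption about how dim_datetime is keyed):

      ```sql
      -- Hypothetical key derivation for the sample row's timestamp.
      SELECT to_char(timestamp '2022-01-02 13:05', 'YYYYMMDDHH24MI')::bigint AS date_key;
      -- The fact table then stores date_key and joins to dim_datetime on it.
      ```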

      P.S: Sorry if my question is confusing; trying my best to explain since I am not proficient in this field.

    • RE: How to debug "General error: 1030 Got error 0 "Internal error/check (Not system error)" from storage engine Aria

      MariaDB-10.5.13 has this https://github.com/MariaDB/server/commit/f03fee06b0 that potentially improved the error message.

      You could try https://mariadb.com/kb/en/check-table/ on the underlying tables, though I suspect that if there were errors there, they would have shown up in the MariaDB error log.

      To test whether the released 10.5.13 has fixed this error (unlikely, as I can't see anything in the https://mariadb.com/kb/en/mariadb-10513-release-notes/ or change log), or at least provides a more detailed error, you can run the mariadb:10.5.13 container on your same data directory and execute the same query.

      If you want to check if the soon to be released 10.5.14 release fixes it, you can use the quay.io/mariadb-foundation/mariadb-devel:10.5 image (ref https://mariadb.org/new-service-quay-io-mariadb-foundation-mariadb-devel/ ).

      Otherwise I suggest searching and/or filing a bug report on https://jira.mariadb.org/ , including your configuration and table/view structures, potentially with explain {query}.

      If you really want to help out the MariaDB developers, obtain a backtrace with the debug container ( https://mariadb.org/mariadb-debug-container/ ), set a breakpoint with b mysql_errno_to_sqlstate before r, and obtain the full stack trace where the error occurred.

    • Very slow large stored procedure

      I have a rather large stored procedure that processes over a hundred thousand records. This SP had been running beautifully until last week, when the database server was hit with a ransomware attack. By a HUGE stroke of luck, I had taken an offsite backup of the entire database the night before for testing. Historically, the SP ran in under 3 minutes.

      We have a new server with the same OS, same MySQL version (5.7), and same processors and memory (it's a VM) as the corrupted server. The only thing that was lost is the my.ini; IT didn't have the database server in their backup plan. IT found an older copy of the my.ini file, which I ran with. The stored proc is now EXTREMELY slow: a calculated 30 hours to run rather than 3 minutes. Barring something wrong with the VM setup, I'm doubting the my.ini I was given.

      Pertinent server specs (where have I gone wrong?):

      • OS: Windows server 2012 R2 64 Bit

      • Available RAM: 9.41G

      • Available Virtual Memory: 6.67 GB

      • Processor:Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz, 2098 Mhz, 1 Core(s), 1 Logical Processor(s)

      • Page File Space: 1.81 GB

      my.ini:

      # Other default tuning values
      # MySQL Server Instance Configuration File
      # ----------------------------------------------------------------------
      # Generated by the MySQL Server Instance Configuration Wizard
      #
      #
      # Installation Instructions
      # ----------------------------------------------------------------------
      #
      # On Linux you can copy this file to /etc/my.cnf to set global options,
      # mysql-data-dir/my.cnf to set server-specific options
      # (@localstatedir@ for this installation) or to
      # ~/.my.cnf to set user-specific options.
      #
      # On Windows you should keep this file in the installation directory 
      # of your server (e.g. C:\Program Files\MySQL\MySQL Server X.Y). To
      # make sure the server reads the config file use the startup option 
      # "--defaults-file". 
      #
      # To run run the server from the command line, execute this in a 
      # command line shell, e.g.
      # mysqld --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
      #
      # To install the server as a Windows service manually, execute this in a 
      # command line shell, e.g.
      # mysqld --install MySQLXY --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
      #
      # And then execute this in a command line shell to start the server, e.g.
      # net start MySQLXY
      #
      #
      # Guildlines for editing this file
      # ----------------------------------------------------------------------
      #
      # In this file, you can use all long options that the program supports.
      # If you want to know the options a program supports, start the program
      # with the "--help" option.
      #
      # More detailed information about the individual options can also be
      # found in the manual.
      #
      # For advice on how to change settings please see
      # http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html
      #
      #
      # CLIENT SECTION
      # ----------------------------------------------------------------------
      #
      # The following options will be read by MySQL client applications.
      # Note that only client applications shipped by MySQL are guaranteed
      # to read this section. If you want your own MySQL client program to
      # honor these values, you need to specify it as an option during the
      # MySQL client library initialization.
      #
      [client]
      

      pipe
      socket=0.0
      port=3306

      [mysql]
      no-beep
      default-character-set=utf8

      # SERVER SECTION
      # ----------------------------------------------------------------------
      #
      # The following options will be read by the MySQL Server. Make sure that
      # you have installed the server correctly (see above) so it reads this
      # file.
      #
      server_type=2

      [mysqld]

      # The next three options are mutually exclusive to SERVER_PORT below.
      skip-networking
      enable-named-pipe
      shared-memory
      shared-memory-base-name=MYSQL

      # The Pipe the MySQL Server will use
      socket=MYSQL

      # The TCP/IP Port the MySQL Server will listen on
      port=3306
      wait_timeout=2147483

      # Path to installation directory. All paths are usually resolved relative to this.
      basedir="C:/Program Files/MySQL/MySQL Server 5.7/"

      # Path to the database root
      datadir=C:/ProgramData/MySQL/MySQL Server 5.7/Data

      # The default character set that will be used when a new schema or table is
      # created and no character set is defined
      character-set-server=utf8

      # The default storage engine that will be used when create new tables when
      default-storage-engine=INNODB

      # Set the SQL mode to strict
      sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"

      # *** Query Cache Configuration
      query_cache_limit=2M
      query_cache_min_res_unit=512k
      query_cache_size=200M

      # General and Slow logging.
      log-output=FILE
      general-log=0
      general_log_file="PUPIL-PBP.log"
      slow-query-log=1
      slow_query_log_file="PUPIL-PBP-slow.log"
      long_query_time=10

      # Binary Logging.
      log-bin

      # Error Logging.
      log-error="PUPIL-PBP.err"

      # Server Id.
      server-id=1

      # Secure File Priv.
      secure-file-priv="C:/ProgramData/MySQL/MySQL Server 5.7/Uploads"

      # The maximum amount of concurrent sessions the MySQL server will
      # allow. One of these connections will be reserved for a user with
      # SUPER privileges to allow the administrator to login even if the
      # connection limit has been reached.
      max_connections=151

      # The number of open tables for all threads. Increasing this value
      # increases the number of file descriptors that mysqld requires.
      # Therefore you have to make sure to set the amount of open files
      # allowed to at least 4096 in the variable "open-files-limit" in
      # section [mysqld_safe]
      table_open_cache=2000

      # Maximum size for internal (in-memory) temporary tables. If a table
      # grows larger than this value, it is automatically converted to a disk
      # based table. This limitation is for a single table. There can be many
      # of them.
      tmp_table_size=512M

      # How many threads we should keep in a cache for reuse. When a client
      # disconnects, the client's threads are put in the cache if there aren't
      # more than thread_cache_size threads from before. This greatly reduces
      # the amount of thread creations needed if you have a lot of new
      # connections. (Normally this doesn't give a notable performance
      # improvement if you have a good thread implementation.)
      thread_stack=192K
      thread_cache_size=100

      # *** MyISAM Specific options

      # The maximum size of the temporary file MySQL is allowed to use while
      # recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE).
      # If the file-size would be bigger than this, the index will be created
      # through the key cache (which is slower).
      myisam_max_sort_file_size=100G

      # If the temporary file used for fast index creation would be bigger
      # than using the key cache by the amount specified here, then prefer the
      # key cache method. This is mainly used to force long character keys in
      # large tables to use the slower key cache method to create the index.
      myisam_sort_buffer_size=8M

      # Size of the Key Buffer, used to cache index blocks for MyISAM tables.
      # Do not set it larger than 30% of your available memory, as some memory
      # is also required by the OS to cache rows. Even if you're not using
      # MyISAM tables, you should still set it to 8-64M as it will also be
      # used for internal temporary disk tables.
      key_buffer_size=512M

      # Size of the buffer used for doing full table scans of MyISAM tables.
      # Allocated per thread, if a full scan is needed.
      read_buffer_size=0
      read_rnd_buffer_size=0

      # *** INNODB Specific options ***
      innodb_data_home_dir=0.0

      # Use this option if you have a MySQL server with InnoDB support enabled
      # but you do not plan to use it. This will save memory and disk space
      # and speed up some things.
      skip-innodb

      # If set to 1, InnoDB will flush (fsync) the transaction logs to the
      # disk at each commit, which offers full ACID behavior. If you are
      # willing to compromise this safety, and you are running small
      # transactions, you may set this to 0 or 2 to reduce disk I/O to the
      # logs. Value 0 means that the log is only written to the log file and
      # the log file flushed to disk approximately once per second. Value 2
      # means the log is written to the log file at each commit, but the log
      # file is only flushed to disk approximately once per second.
      innodb_flush_log_at_trx_commit=0

      # The size of the buffer InnoDB uses for buffering log data. As soon as
      # it is full, InnoDB will have to flush it to disk. As it is flushed
      # once per second anyway, it does not make sense to have it very large
      # (even with long transactions).
      innodb_log_buffer_size=256M

      # InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and
      # row data. The bigger you set this the less disk I/O is needed to
      # access data in tables. On a dedicated database server you may set this
      # parameter up to 80% of the machine physical memory size. Do not set it
      # too large, though, because competition of the physical memory may
      # cause paging in the operating system. Note that on 32bit systems you
      # might be limited to 2-3.5G of user level memory per process, so do not
      # set it too high.
      innodb_buffer_pool_size=4G

      # Size of each log file in a log group. You should set the combined size
      # of log files to about 25%-100% of your buffer pool size to avoid
      # unneeded buffer pool flush activity on log file overwrite. However,
      # note that a larger logfile size will increase the time needed for the
      # recovery process.
      innodb_log_file_size=1G

      # Number of threads allowed inside the InnoDB kernel. The optimal value
      # depends highly on the application, hardware as well as the OS
      # scheduler properties. A too high value may lead to thread thrashing.
      innodb_thread_concurrency=8

      # The increment size (in MB) for extending the size of an auto-extend
      # InnoDB system tablespace file when it becomes full.
      innodb_autoextend_increment=64

      # The number of regions that the InnoDB buffer pool is divided into.
      # For systems with buffer pools in the multi-gigabyte range, dividing
      # the buffer pool into separate instances can improve concurrency, by
      # reducing contention as different threads read and write to cached pages.
      innodb_buffer_pool_instances=8

      # Determines the number of threads that can enter InnoDB concurrently.
      innodb_concurrency_tickets=5000

      # Specifies how long in milliseconds (ms) a block inserted into the old
      # sublist must stay there after its first access before it can be moved
      # to the new sublist.
      innodb_old_blocks_time=1000

      # It specifies the maximum number of .ibd files that MySQL can keep open
      # at one time. The minimum value is 10.
      innodb_open_files=300

      # When this variable is enabled, InnoDB updates statistics during
      # metadata statements.
      innodb_stats_on_metadata=0

      # When innodb_file_per_table is enabled (the default in 5.6.6 and
      # higher), InnoDB stores the data and indexes for each newly created
      # table in a separate .ibd file, rather than in the system tablespace.
      innodb_file_per_table=1

      # Use the following list of values: 0 for crc32, 1 for strict_crc32,
      # 2 for innodb, 3 for strict_innodb, 4 for none, 5 for strict_none.
      innodb_checksum_algorithm=0

      # The number of outstanding connection requests MySQL can have.
      # This option is useful when the main MySQL thread gets many connection
      # requests in a very short time. It then takes some time (although very
      # little) for the main thread to check the connection and start a new
      # thread. The back_log value indicates how many requests can be stacked
      # during this short time before MySQL momentarily stops answering new
      # requests. You need to increase this only if you expect a large number
      # of connections in a short period of time.
      back_log=80

      # If this is set to a nonzero value, all tables are closed every
      # flush_time seconds to free up resources and synchronize unflushed
      # data to disk. This option is best used only on systems with minimal
      # resources.
      flush_time=0

      # The minimum size of the buffer that is used for plain index scans,
      # range index scans, and joins that do not use indexes and thus perform
      # full table scans.
      join_buffer_size=256K

      # The maximum size of one packet or any generated or intermediate
      # string, or any parameter sent by the mysql_stmt_send_long_data() C
      # API function.
      max_allowed_packet=32M

      # If more than this many successive connection requests from a host are
      # interrupted without a successful connection, the server blocks that
      # host from performing further connections.
      max_connect_errors=100

      # Changes the number of file descriptors available to mysqld. You
      # should try increasing the value of this option if mysqld gives you
      # the error "Too many open files".
      open_files_limit=4161

      # If you see many sort_merge_passes per second in SHOW GLOBAL STATUS
      # output, you can consider increasing the sort_buffer_size value to
      # speed up ORDER BY or GROUP BY operations that cannot be improved with
      # query optimization or improved indexing.
      sort_buffer_size=256K

      # The number of table definitions (from .frm files) that can be stored
      # in the definition cache. If you use a large number of tables, you can
      # create a large table definition cache to speed up opening of tables.
      # The table definition cache takes less space and does not use file
      # descriptors, unlike the normal table cache. The minimum and default
      # values are both 400.
      table_definition_cache=1400

      # Specify the maximum size of a row-based binary log event, in bytes.
      # Rows are grouped into events smaller than this size if possible. The
      # value should be a multiple of 256.
      binlog_row_event_max_size=8K

      # If the value of this variable is greater than 0, a replication slave
      # synchronizes its master.info file to disk (using fdatasync()) after
      # every sync_master_info events.
      sync_master_info=10000

      # If the value of this variable is greater than 0, the MySQL server
      # synchronizes its relay log to disk (using fdatasync()) after every
      # sync_relay_log writes to the relay log.
      sync_relay_log=10000

      # If the value of this variable is greater than 0, a replication slave
      # synchronizes its relay-log.info file to disk (using fdatasync())
      # after every sync_relay_log_info transactions.
      sync_relay_log_info=10000

      # Load mysql plugins at start. "plugin_x ; plugin_y".
      plugin_load="mysqlx"

      # MySQL server's plugin configuration.
      loose_mysqlx_port=33060

    • RE: Select INTs from start of string separated by a character (pipe). Update other columns using these INTs

      Given the following table:

      CREATE TABLE dbo.T
      (
          [Description] varchar(8000) NOT NULL,
          T1 integer NULL,
          T2 integer NULL
      );
      

      and data:

      INSERT dbo.T
          ([Description])
      VALUES
          ('| 30 | 30 | INTERNAL AUDIT | RL | OK'),
          ('| SE+17.5D  CYL 1.25'),
          ('| 10 | 11 | INTERNAL AUDIT | MM | CORRECTED'),
          ('| 5 | 5 | INTERNAL AUDIT | JY | GOOD'),
          ('| 56 | 56 | INTERNAL AUDIT | JMS | OK'),
          ('| 10 | 10 | INTERNAL AUDIT | CN | None'),
          ('| 3 | 3 | INTERNAL AUDIT | MG | GOOD'),
          ('| 46 | 47 | INTERNAL AUDIT | AB | None'),
          ('| 23 | 23 | INTERNAL AUDIT | BA | OK'),
          ('| 30 | 30 | INTERNAL AUDIT | RL | OK'),
          ('| 25 | 29 | INTERNAL AUDIT | KV | CORRECTED'),
          ('| 4 | 3 | INTERNAL AUDIT | KV | PULLING, LIVE AR'),
          ('| ref # - 0006-4121-02');
      

      Solution

      The following commented code uses an updatable cursor:

      SET XACT_ABORT, NOCOUNT ON;
      SET STATISTICS XML OFF;
      

      DECLARE
      @Description varchar(8000),
      @Pipe1Position integer,
      @Pipe2Position integer,
      @T1 integer,
      @T2 integer;

      DECLARE DataCursor CURSOR LOCAL
      SCROLL DYNAMIC SCROLL_LOCKS
      FOR SELECT [Description] FROM dbo.T
      FOR UPDATE OF T1, T2;

      OPEN DataCursor;

      -- Get the first row
      FETCH FIRST FROM DataCursor INTO @Description;

      WHILE @@FETCH_STATUS = 0
      BEGIN
          -- Reset
          SET @T1 = NULL;
          SET @T2 = NULL;

          -- Find the pipe after the first one
          SET @Pipe1Position = CHARINDEX('|', @Description, 2);

          IF @Pipe1Position > 0
          BEGIN TRY
              -- Extract the first integer
              SET @T1 = CONVERT(integer,
                  SUBSTRING(@Description, 2, @Pipe1Position - 2));

              -- Find the next pipe
              SET @Pipe2Position = CHARINDEX('|', @Description, @Pipe1Position + 1);

              IF @Pipe2Position > 0
              BEGIN
                  -- Extract the second integer
                  SET @T2 = CONVERT(integer,
                      SUBSTRING(@Description, @Pipe1Position + 1, @Pipe2Position - @Pipe1Position - 1));
              END;
          END TRY
          BEGIN CATCH
              -- Swallow conversion errors; @T1 and @T2 remain NULL
          END CATCH;

          -- Perform the update for this row
          UPDATE dbo.T
          SET T1 = @T1, T2 = @T2
          WHERE CURRENT OF DataCursor;

          -- Next row
          FETCH NEXT FROM DataCursor INTO @Description;
      END;

      CLOSE DataCursor;
      DEALLOCATE DataCursor;

      The final state of the table is:

      Description T1 T2
      | 30 | 30 | INTERNAL AUDIT | RL | OK 30 30
      | SE+17.5D CYL 1.25 NULL NULL
      | 10 | 11 | INTERNAL AUDIT | MM | CORRECTED 10 11
      | 5 | 5 | INTERNAL AUDIT | JY | GOOD 5 5
      | 56 | 56 | INTERNAL AUDIT | JMS | OK 56 56
      | 10 | 10 | INTERNAL AUDIT | CN | None 10 10
      | 3 | 3 | INTERNAL AUDIT | MG | GOOD 3 3
      | 46 | 47 | INTERNAL AUDIT | AB | None 46 47
      | 23 | 23 | INTERNAL AUDIT | BA | OK 23 23
      | 30 | 30 | INTERNAL AUDIT | RL | OK 30 30
      | 25 | 29 | INTERNAL AUDIT | KV | CORRECTED 25 29
      | 4 | 3 | INTERNAL AUDIT | KV | PULLING, LIVE AR 4 3
      | ref # - 0006-4121-02 NULL NULL

      Online demo

      https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=8425325a6e3940eaabb5f69f8004c5b5

      Commentary

      The database is not the best place to do this kind of work. The pipe-delimited data looks as if it originated in an external system and was imported from a flat file.

      Instead of importing the data as-is and then processing it inside SQL Server, it would be better to use a dedicated ETL tool like SSIS to transform the flat-file data directly and load it into one or more relational database tables.

      A cursor won't be ideal if you have tens of millions of rows to process in a hurry, but it does make the logic clear, and it performs reasonably well all things considered.

      You should also look into moving to a more modern version of SQL Server. Features like STRING_SPLIT and TRY_CONVERT make coding this sort of thing a lot easier.
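
      On SQL Server 2012 or later, for example, the extraction can be done set-based without a cursor. This is only a sketch reusing the same CHARINDEX logic as above, with TRY_CONVERT absorbing any conversion failures:

      ```sql
      -- Set-based alternative (requires SQL Server 2012+ for TRY_CONVERT).
      -- Rows without a second pipe are excluded, so their T1/T2 stay NULL.
      UPDATE T
      SET T1 = TRY_CONVERT(integer,
                   SUBSTRING([Description], 2, P1.Pos - 2)),
          T2 = CASE WHEN P2.Pos > 0
                    THEN TRY_CONVERT(integer,
                             SUBSTRING([Description], P1.Pos + 1, P2.Pos - P1.Pos - 1))
               END
      FROM dbo.T AS T
      CROSS APPLY (SELECT CHARINDEX('|', [Description], 2)) AS P1 (Pos)
      CROSS APPLY (SELECT CHARINDEX('|', [Description], P1.Pos + 1)) AS P2 (Pos)
      WHERE P1.Pos > 0;
      ```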

      posted in SQL
    • RE: NONCLUSTERED index seek on mslid is slow

      I'd be interested in the number of total rows you have in your Parts.ManufacturingData table and what the total runtime currently is for your query. Does the number of rows the execution plan says it's returning (roughly 32 million) actually make sense for your query?

      Maybe you'll find a filtered index saves you a little bit of time since it'll pre-materialize only the data you want. How about an index with this definition, does it make any difference?

      CREATE NONCLUSTERED INDEX IX_ManufacturingData_MSLID_Filtered ON Parts.ManufacturingData (MSLID) WHERE MSLID IS NOT NULL;
      

      Note you shouldn't need to do anything different to your query to use the above filtered index once it's created. But you should check the execution plan to ensure the optimizer chose this new index over any other indexes on your table when the query runs.
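
      As a sanity check, you can also look at the index usage DMV after the query has run a few times. This is a sketch assuming the filtered index above was created; substitute your index name if it differs:

      ```sql
      -- Check whether the optimizer actually used the new index
      -- (counters reset on instance restart).
      SELECT OBJECT_NAME(s.object_id) AS table_name,
             i.name                   AS index_name,
             s.user_seeks,
             s.user_scans
      FROM sys.dm_db_index_usage_stats AS s
      JOIN sys.indexes AS i
          ON i.object_id = s.object_id
         AND i.index_id  = s.index_id
      WHERE i.name = N'IX_ManufacturingData_MSLID_Filtered';
      ```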

      If that doesn't make any difference, the other thing you can try is adding the FORCESCAN hint (https://www.mssqltips.com/sqlservertip/3296/why-the-sql-server-forcescan-hint-exists/) to your query like so:

      select partid, mslid
      into ExtractReports.dbo.manufactureparts
      from parts.manufacturingdata m with (nolock, FORCESCAN)
      where mslid is not null
      

      This tells the optimizer to use a scan operation instead of a seek against your data. A scan is generally more performant when a majority of the rows in the table meet the criteria of your WHERE clause; i.e. at that point it's faster to scan the entire table and filter out the unwanted rows than to seek against so many rows. But it's hard to say whether this will help in your circumstances without knowing your data or testing it.

      Please note that table hints (https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-table?view=sql-server-ver15) should be used cautiously and only where alternative optimization methods are not possible. Some query hints limit the number of execution plans the optimizer can choose from, and can therefore cause an error to be thrown when certain queries using those hints try to execute.

      In this case, I think using the FORCESCAN hint is likely ok, as your query is simple, and I don't believe it limits the number of query plans as much as other hints do.

      posted in SQL
    • I get the error "(psycopg2.OperationalError) FATAL: role "wsb" does not exist", but the user does exist

      I am trying to connect to my Postgres database using SQLAlchemy. It was working fine yesterday (01/27/22). Now I get the following error:

      sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: role "wsb" does not exist

      But the user does exist:

                                         List of roles
       Role name |                         Attributes                         | Member of 
      -----------+------------------------------------------------------------+-----------
       wsb       | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
       wsb2      |                                                            | {}
      

      As you can see, I tried making another user "wsb2", but that didn't work either. Below is my script:

      db_string = "postgresql://wsb:wsb@localhost:5432/wsb"
      

      db = create_engine(db_string)
      meta = MetaData()

      submissions_table = Table('wsb_submissions', meta,
          Column('index', INTEGER),
          Column('submission_id', TEXT),
          Column('title', TEXT),
          Column('created', TIMESTAMP),
          Column('author', TEXT),
          Column('ticker_mentioned', TEXT),
          Column('subreddit', TEXT),
          Column('emotes', ARRAY(TEXT)),
          Column('rocket_emotes', INTEGER)
      )

      comments_table = Table('wsb_comments', meta,
          Column('index', INTEGER),
          Column('comment_id', INTEGER),
          Column('body', TEXT),
          Column('created', TIMESTAMP),
          Column('author', TEXT),
          Column('rocket_emotes', INTEGER),
          Column('subreddit', TEXT),
          Column('emotes', ARRAY(TEXT)),
          Column('submission_key', TEXT)
      )
      db.connect()

      also here is my Dockerfile:

      FROM postgres:13.2
      

      ENV POSTGRES_PASSWORD=wsb
      ENV POSTGRES_USER=wsb
      ENV POSTGRES_DB=wsb

      COPY wsb.sql /docker-entrypoint-initdb.d/
      EXPOSE 5432

      posted in SQL
    • How to use IN with string comma separated

      I have a value in table1 like this:

      Value
      -------
      JT, KF, LF
      

      When selecting from another table, how can I get a value like this?

      Select a.Field from table2
      Inner Join Table1 on Table1.field = Table2.field
      Where Type IN(Table2.Value)
      
      posted in SQL
    • Rolling Upgrade Win2012R2/SQL2012 Availability Group to Win2019/SQL2019

      To do a rolling upgrade of an Availability Group from Windows 2012R2, SQL 2012 to Windows 2019, SQL 2019 do I have to have an intermediary set of servers on Windows 2016? Based on the documentation it sounds like you can only go up one version (verbiage: "You can upgrade to the next version"). The only clustering component is for the Availability Group, no other roles and no services are clustered. I know SQL can go directly to 2019, it is the clustering component of the AG I am unsure about.

      posted in SQL
    • RE: Which Index Works Better with pg_trgm Extension?

      The answer depends on the query you want to optimize.

      If the query looks like

      SELECT ...
      FROM ...
      WHERE col % 'string'
      

      a GIN index usually works better.

      If the query looks like

      SELECT ...
      FROM ...
      ORDER BY col <-> 'string'
      LIMIT 1
      

      you need to use a GiST index, because a GIN index cannot support that.
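
      For reference, the two index variants might be created like this, assuming the pg_trgm extension is available and tbl/col stand in for your table and column:

      ```sql
      CREATE EXTENSION IF NOT EXISTS pg_trgm;

      -- Supports WHERE col % 'string' (similarity filter)
      CREATE INDEX tbl_col_trgm_gin  ON tbl USING gin  (col gin_trgm_ops);

      -- Supports ORDER BY col <-> 'string' LIMIT 1 (nearest-neighbor search)
      CREATE INDEX tbl_col_trgm_gist ON tbl USING gist (col gist_trgm_ops);
      ```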

      posted in SQL