I got it to work using not exists
and a subquery.
SELECT
COUNT(pr.id) AS product_count
FROM
products pr
WHERE NOT EXISTS(
SELECT
*
FROM
vendor_product_map vp
WHERE
vp.product = pr.id AND vp.vendor = 3
)
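For comparison, the same anti-join can also be written with a LEFT JOIN plus an IS NULL check; this is just a sketch using the same tables and columns as the query above, and should produce the same count:
SELECT
COUNT(pr.id) AS product_count
FROM
products pr
LEFT JOIN vendor_product_map vp
ON vp.product = pr.id AND vp.vendor = 3
WHERE
vp.product IS NULL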
I have 2 large tables.
Sample query:
SELECT USERNAME,S.ProductName,S.Amt,S.Date
FROM USERS U
JOIN SALES S
ON U.ID=S.UID
Presently the Users.ID and Sales.ID are clustered indexes.
So as to speed up the JOIN, will it help if I add a non-clustered index on Sales.UID, and add the following columns as included columns: ProductName, Amt and Date?
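For reference, the index being described would look something like this (the index name here is just illustrative):
CREATE NONCLUSTERED INDEX IX_Sales_UID
ON Sales (UID)
INCLUDE (ProductName, Amt, [Date]);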
Ok, so I figured out a way to do this. I don't have a lot of experience in this field, so there are probably better ways, but here it is.
The first step is to set a policy that evaluates visibility based on the presence of a matching row in user_connections for a given user. The USING
expression for the policy ended up looking like this:
(
(uid() = id) OR
(
( SELECT count(*) AS count
FROM user_connections
WHERE (
user_connections.user_a_confirmed AND
user_connections.user_b_confirmed AND
(
(user_connections.user_a_id = uid()) OR
(user_connections.user_b_id = uid())
) AND
(
(user_connections.user_a_id = users.id) OR
(user_connections.user_b_id = users.id)
)
)
) >= 1
)
)
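For context, an expression like that is attached to the table through a policy. A minimal sketch follows; the policy name and the use of EXISTS instead of count(*) are my own choices rather than something from the answer above, and it assumes the connection row must tie the viewed user (users.id) to the current user (uid()):
ALTER TABLE users ENABLE ROW LEVEL SECURITY;

CREATE POLICY users_visible_to_connections ON users
FOR SELECT
USING (
  uid() = id
  OR EXISTS (
    SELECT 1
    FROM user_connections c
    WHERE c.user_a_confirmed
      AND c.user_b_confirmed
      AND (c.user_a_id = uid()     OR c.user_b_id = uid())
      AND (c.user_a_id = users.id  OR c.user_b_id = users.id)
  )
);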
The second step is to set up a publicly accessible view for users that filters down the visible columns. I did this via:
CREATE OR REPLACE VIEW users_public AS
SELECT id, display_name
FROM users;
I have a table "article", and for each entry some "tags" are stored in an array column:
name | tags
test1 | {t1,t3}
test2 | {t2,t3}
test3 | {t3}
test4 | {t1,t5}
test5 | {t1,t3}
test6 | {t2,t3}
I can query for tags which passed as an array:
SELECT name, tags FROM article WHERE tags && array['t1','t3','t10']::_varchar
This is working, but I actually want only those entries to be returned whose tags are all contained in the passed array (not just one of them).
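In case it helps frame the question: PostgreSQL also has the "is contained by" operator <@, which is true only when every element of the left array appears in the right array, so a query along these lines (same table and cast as above) expresses the "all tags must be in the passed array" condition:
SELECT name, tags FROM article WHERE tags <@ array['t1','t3','t10']::_varchar
With the sample data above and the array {t1,t3,t10}, that would return test1, test3 and test5, but not test4 (t5 is not in the passed list).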
The case construct has to have an "end".
CASE
WHEN condition1 THEN value1
WHEN condition2 THEN value2
END [column-alias]
So, the first few lines of your query might look more like this:
SELECT DISTINCT
AD_MA_CODE
, AD_PA_CODE
, AD_EN_CODE_PSV
, AD_CODE
, CASE
WHEN AD_EN_CODE_PSV = ID_EN_CODE_PSV AND AD_CODE = ID_AD_CODE
THEN '1'
WHEN AD_CODE_POSTAL=ZL_CODE_POSTAL AND AD_VILLE=ZL_VILLE
THEN AD_ADR_DEFAUT
END c1
, AD_FACTURATION_VN
. . .
I would also caution against the "blind" use of select distinct.
If you're getting "duplicate" rows in your query, then removing the cause of that duplication (often an incorrect join condition) is a better choice than trying to get the database to resolve them out afterwards.
This can have a huge negative performance impact, especially on resultsets with a large number of columns (distinct a, b, c
can perform as poorly as group by a, b, c order by a, b, c
- and you've a lot more than three columns there!).
This is my column cmt_json_value, whose values are of type json and hold an array:
[{"name": "Pending", "value": "PENDING"}, {"name": "Error", "value": "ERROR"},{"name":"Complete", "value":"COMPLETE"},{"name":"In-Progress", "value":"IN_PROGRESS"}]
I want to write a PostgreSQL query to fetch name and value as columns from the table configuration_matrix.
My existing query is:
select cmt_json_value ->>'name' as name , cmt_json_value ->> 'value' as value
from configuration_matrix
where
cmt_category = 'LIST_OF_VALUES' and
cmt_key = 'JOB_STATUS'
order by cmt_json_value ->> 'name' asc;
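Since cmt_json_value holds a JSON array rather than a single object, ->> 'name' on the whole column returns NULL; the array first has to be expanded into one row per element. A sketch of that, assuming the column really is of type json (use jsonb_array_elements for jsonb):
SELECT elem ->> 'name' AS name, elem ->> 'value' AS value
FROM configuration_matrix
CROSS JOIN LATERAL json_array_elements(cmt_json_value) AS elem
WHERE cmt_category = 'LIST_OF_VALUES'
  AND cmt_key = 'JOB_STATUS'
ORDER BY elem ->> 'name' ASC;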
I can't manage, in DAX, to calculate for a year the sum of the values that fall within that year: I would like the proportion of each value that belongs to a year, according to its starting date and its ending date.
For example, I want to be able to calculate, for each year, the value of the "workload" during that year: so for 2019 the share of 150 that falls between 01/02/2019 and the end of the year, then the share of 150 between 01/01/2020 and 12/31/2020, then the share between 01/01/2021 and 05/05/2021 (without forgetting phase 1-2 and the other IDs, obviously).
I have 2 databases on the same PostgreSQL server:
mydb: used for development
test_mydb: used for testing; it is pre-seeded with some test data that is significantly smaller than mydb
The mydb database is updated via migration scripts. What I want is to sync the changes into test_mydb after the migrations are run; afterwards, I want to pg_dump the contents of test_mydb in order to generate a test dump.
Is there a way to look for differences between the schemas of mydb and test_mydb and apply them to test_mydb?
I want schema-only differences and not full data replication.
YES, they will read the data in clear text!
You have 2 options (+ a bonus) to achieve what you want:
Dynamic Data Masking
https://docs.microsoft.com/en-us/sql/relational-databases/security/dynamic-data-masking?view=sql-server-ver15 : this is only applied on your SQL Server instance. As soon as you take a backup of it to a .bak
file and send it to the vendor, the Dynamic Data Masking doesn't even come with it. The vendor doesn't even have to create a new user or change the ownership; their own user will already have full access.
The way you should use Dynamic Data Masking is described here: https://www.sqlshack.com/using-dynamic-data-masking-in-sql-server-2016-to-protect-sensitive-data/#:%7E:text=Dynamic%20Data%20Masking%20is%20a,data%20at%20the%20database%20layer.&text=This%20feature%20requires%20no%20coding,data%20stored%20in%20the%20disk. You keep the database on your side and you create a user for the vendor:
CREATE USER DDMUser WITHOUT LOGIN;
GRANT SELECT ON Employee_Financial TO DDMUser;
you then mask a column:
ALTER TABLE Employee_Financial
ALTER COLUMN EMP_Last_Name varchar(10) MASKED WITH (FUNCTION = 'default()');
You then select that column as the user to see what they will see:
EXECUTE AS USER = 'DDMUser';
SELECT * FROM Employee_Financial;
REVERT;
And since you are selecting that data as the user DDMUser, you will not see the real values in that column.
You now open a port for the vendor and you can allow them to connect to your database because they will not see the data you have masked.
Static Data Masking
https://www.mssqltips.com/sqlservertip/5939/sql-server-static-data-masking-example/ is only available in preview in SQL Server 2019 and Azure SQL Database.
In that case you can mask the database and send it to the vendor.
Bonus
Lately I have written extensively ( https://www.jeeja.biz/blog/ ) about how to set up static data anonymization for SQL Server.
After a few months I have to admit it: it's a pool of blood. I'm the first to tell you: don't go this way, and keep this option as the very last resort. Focus on the Dynamic or Static Data Masking provided by Microsoft out of the box.
IIRC there is no way with apt directly, but you can do it more manually as in the answer to this question: https://askubuntu.com/a/482936
The process would be:
apt-get download ; sudo dpkg --unpack *.deb
Edit /var/lib/dpkg/info/.postinst (the linked answer just deletes it, but there may be other tasks you still require it to perform)
sudo dpkg --configure ; sudo apt-get install -yf # To verify/fix dependencies
That final step might be sufficient to allow you to skip installing dependencies beforehand in step 1; if that is the case, then there is also the advantage that they'll be automatically removed if you remove pg and nothing else requires them.
A manual approach like this might be fine for a one-off; personally, when upgrades happen, the post-install script won't see the need to run something like initdb, as things are already configured (that being the point where it is needed). If you need this a lot, as part of a product install perhaps, then it could be cumbersome; the other options then are to have a script that undoes what initdb does after the fact, or to maintain your own apt repo containing modified versions of this package (which is in itself a chunk of admin).
In theory what I want should look like this:
| StudentID | StaffID | Phone_Model | Class |
| --------- | ------- | ----------- | ----- |
| 1 | 1 | iPhone7 | S201 |
| 2 | 3 | iPhone11 | S203 |
Is there something missing from my SQL statement?
You are missing a relation between Staff and Student. You need a column StaffID int in the Student table.
create table Staff (
StaffID int ,
Class varchar(10),
TotalPhonesConfiscated int );
insert into Staff values
(1,'S201',1),
(2,'S202',0),
(3,'S203',1);
create table Student (
StudentID int ,
Phone_Model varchar(20),
StaffID int );
insert into Student values
(1,'iPhone7',1),
(2,'Samsung',3);
create table Phones (
StudentID int,
StaffID int ,
Phone_Model varchar(20),
Class varchar(10));
Then you can insert the data:
INSERT INTO Phones ( StudentID,
StaffID,
Phone_Model,
Class
)
SELECT StudentID,
Student.StaffID,
Phone_Model,
Class
FROM Student
INNER JOIN Staff on Student.StaffID=Staff.StaffID
WHERE TotalPhonesConfiscated >= 1;
select * from Phones;
Would result in:
| StudentID | StaffID | Phone_Model | Class |
| --------- | ------- | ----------- | ----- |
| 1 | 1 | iPhone7 | S201 |
| 2 | 3 | Samsung | S203 |
I am new to data warehousing (having learnt this in my school days) and am looking to build a data warehouse as a side project. Below is a simple data warehouse design I came up with:
The data warehouse has 2 dimension tables and 1 fact table. One dimension table contains datetime data and the other contains the device data. The fact table contains the incoming device data values captured in the field. The data granularity in the fact table is 5 minutes.
I am confused about how the time series is handled and would appreciate it if someone could clarify this. Assuming one row of data coming from the device looks like this:
| datetime | drive_a | drive_b | drive_c | shaft_a | shaft_b | shaft_c | total_output |
| -------- | ------- | ------- | ------- | ------- | ------- | ------- | ------------ |
| 02/01/2022 13:05 | 4.2 | 3.2 | 7.4 | 5.3 | 8.2 | 6.4 | 4563.2 |
This will be processed and stored into the fact table in the data warehouse.
How do I handle the datetime column from this incoming data, since the dateKey is not in the same format as the incoming datetime?
I am thinking that the fact table needs another column called dateTime_raw, but that defeats the purpose of a dim_datetime table, doesn't it, since my datetime would then already be in my fact table?
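For what it's worth, the usual pattern is to look the surrogate key up in the date/time dimension while loading, so the raw datetime only lives in staging and never needs an extra column in the fact table. A rough sketch, with entirely hypothetical staging/dimension/fact names and columns (only drive_a and total_output shown for brevity):
INSERT INTO fact_device_reading (datetime_key, device_key, drive_a, total_output)
SELECT dd.datetime_key,
       dv.device_key,
       s.drive_a,
       s.total_output
FROM staging_device_data AS s
JOIN dim_datetime AS dd ON dd.full_datetime = s.datetime
JOIN dim_device   AS dv ON dv.device_code   = s.device_code;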
P.S: Sorry if my question is confusing; trying my best to explain since I am not proficient in this field.
MariaDB 10.5.13 has this commit ( https://github.com/MariaDB/server/commit/f03fee06b0 ) that potentially improved the error message.
You could try CHECK TABLE ( https://mariadb.com/kb/en/check-table/ ) on the underlying tables, though I suspect that if there were errors there they would have shown up in the MariaDB error log.
To test whether the released 10.5.13 has fixed this error (unlikely, I can't see anything in the https://mariadb.com/kb/en/mariadb-10513-release-notes/ or change log), or at least provides a more detailed error, you can run the mariadb:10.5.13
container on your same data directory and execute the same query.
If you want to check whether the soon-to-be-released 10.5.14 fixes it, you can use the quay.io/mariadb-foundation/mariadb-devel:10.5
image (ref https://mariadb.org/new-service-quay-io-mariadb-foundation-mariadb-devel/ ).
Otherwise I suggest searching and/or reporting it on https://jira.mariadb.org/ , including your configuration and table/view structures, potentially with explain {query}.
If you really want to help out the MariaDB developers, obtain a backtrace with the debug container ( https://mariadb.org/mariadb-debug-container/ ), set a breakpoint with b mysql_errno_to_sqlstate
before r
, and obtain the full stacktrace where it occurred.
I have a rather large stored procedure that processes over a hundred thousand records. This SP had been running beautifully until last week, when the database server was hit with a ransomware attack. By a HUGE stroke of luck I took an offsite backup of the entire database the night before for testing. Historically the SP ran in under 3 minutes.
We have a new server with the same OS, same MySQL version (5.7), same processors and memory (it's a VM) as the corrupted server. The only thing that was lost is the my.ini
. IT didn't have the database server in their backup plan. IT found an older copy of the my.ini
file, which I ran with. The stored proc is now EXTREMELY slow, a calculated 30 hours to run rather than 3 minutes. Barring something wrong with the VM setup, I'm doubting the my.ini
I was given.
Pertinent server specs(Where have I gone wrong?):
OS: Windows server 2012 R2 64 Bit
Available RAM: 9.41G
Available Virtual Memory: 6.67 GB
Processor:Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz, 2098 Mhz, 1 Core(s), 1 Logical Processor(s)
Page File Space: 1.81 GB
my.ini
:
# Other default tuning values
# MySQL Server Instance Configuration File
# ----------------------------------------------------------------------
# Generated by the MySQL Server Instance Configuration Wizard
#
#
# Installation Instructions
# ----------------------------------------------------------------------
#
# On Linux you can copy this file to /etc/my.cnf to set global options,
# mysql-data-dir/my.cnf to set server-specific options
# (@localstatedir@ for this installation) or to
# ~/.my.cnf to set user-specific options.
#
# On Windows you should keep this file in the installation directory
# of your server (e.g. C:\Program Files\MySQL\MySQL Server X.Y). To
# make sure the server reads the config file use the startup option
# "--defaults-file".
#
# To run run the server from the command line, execute this in a
# command line shell, e.g.
# mysqld --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
#
# To install the server as a Windows service manually, execute this in a
# command line shell, e.g.
# mysqld --install MySQLXY --defaults-file="C:\Program Files\MySQL\MySQL Server X.Y\my.ini"
#
# And then execute this in a command line shell to start the server, e.g.
# net start MySQLXY
#
#
# Guildlines for editing this file
# ----------------------------------------------------------------------
#
# In this file, you can use all long options that the program supports.
# If you want to know the options a program supports, start the program
# with the "--help" option.
#
# More detailed information about the individual options can also be
# found in the manual.
#
# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html
#
#
# CLIENT SECTION
# ----------------------------------------------------------------------
#
# The following options will be read by MySQL client applications.
# Note that only client applications shipped by MySQL are guaranteed
# to read this section. If you want your own MySQL client program to
# honor these values, you need to specify it as an option during the
# MySQL client library initialization.
#
[client]
pipe
socket=0.0
port=3306
[mysql]
no-beep
default-character-set=utf8
# SERVER SECTION
# ----------------------------------------------------------------------
# The following options will be read by the MySQL Server. Make sure that
# you have installed the server correctly (see above) so it reads this
# file.
server_type=2
[mysqld]
# The next three options are mutually exclusive to SERVER_PORT below.
skip-networking
enable-named-pipe
shared-memory
shared-memory-base-name=MYSQL
# The Pipe the MySQL Server will use
socket=MYSQL
# The TCP/IP Port the MySQL Server will listen on
port=3306
wait_timeout=2147483
# Path to installation directory. All paths are usually resolved relative to this.
basedir="C:/Program Files/MySQL/MySQL Server 5.7/"
# Path to the database root
datadir=C:/ProgramData/MySQL/MySQL Server 5.7/Data
# The default character set that will be used when a new schema or table is
# created and no character set is defined
character-set-server=utf8
# The default storage engine that will be used when create new tables when
default-storage-engine=INNODB
# Set the SQL mode to strict
sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
#*** Query Cache Configuration
query_cache_limit = 2M
query_cache_min_res_unit = 512k
query_cache_size = 200M
# General and Slow logging.
log-output=FILE
general-log=0
general_log_file="PUPIL-PBP.log"
slow-query-log=1
slow_query_log_file="PUPIL-PBP-slow.log"
long_query_time=10
# Binary Logging.
log-bin
# Error Logging.
log-error="PUPIL-PBP.err"
# Server Id.
server-id=1
# Secure File Priv.
secure-file-priv="C:/ProgramData/MySQL/MySQL Server 5.7/Uploads"
# The maximum amount of concurrent sessions the MySQL server will
# allow. One of these connections will be reserved for a user with
# SUPER privileges to allow the administrator to login even if the
# connection limit has been reached.
max_connections=151
# The number of open tables for all threads. Increasing this value
# increases the number of file descriptors that mysqld requires.
# Therefore you have to make sure to set the amount of open files
# allowed to at least 4096 in the variable "open-files-limit" in
# section [mysqld_safe]
table_open_cache=2000
# Maximum size for internal (in-memory) temporary tables. If a table
# grows larger than this value, it is automatically converted to disk
# based table This limitation is for a single table. There can be many
# of them.
tmp_table_size=512M
# How many threads we should keep in a cache for reuse. When a client
# disconnects, the client's threads are put in the cache if there aren't
# more than thread_cache_size threads from before. This greatly reduces
# the amount of thread creations needed if you have a lot of new
# connections. (Normally this doesn't give a notable performance
# improvement if you have a good thread implementation.)
thread_stack = 192K
thread_cache_size = 100
#*** MyISAM Specific options
# The maximum size of the temporary file MySQL is allowed to use while
# recreating the index (during REPAIR, ALTER TABLE or LOAD DATA INFILE.
# If the file-size would be bigger than this, the index will be created
# through the key cache (which is slower).
myisam_max_sort_file_size=100G
# If the temporary file used for fast index creation would be bigger
# than using the key cache by the amount specified here, then prefer the
# key cache method. This is mainly used to force long character keys in
# large tables to use the slower key cache method to create the index.
myisam_sort_buffer_size=8M
# Size of the Key Buffer, used to cache index blocks for MyISAM tables.
# Do not set it larger than 30% of your available memory, as some memory
# is also required by the OS to cache rows. Even if you're not using
# MyISAM tables, you should still set it to 8-64M as it will also be
# used for internal temporary disk tables.
key_buffer_size=512M
# Size of the buffer used for doing full table scans of MyISAM tables.
# Allocated per thread, if a full scan is needed.
read_buffer_size=0
read_rnd_buffer_size=0
#*** INNODB Specific options ***
innodb_data_home_dir=0.0
# Use this option if you have a MySQL server with InnoDB support enabled
# but you do not plan to use it. This will save memory and disk space
# and speed up some things.
skip-innodb
# If set to 1, InnoDB will flush (fsync) the transaction logs to the
# disk at each commit, which offers full ACID behavior. If you are
# willing to compromise this safety, and you are running small
# transactions, you may set this to 0 or 2 to reduce disk I/O to the
# logs. Value 0 means that the log is only written to the log file and
# the log file flushed to disk approximately once per second. Value 2
# means the log is written to the log file at each commit, but the log
# file is only flushed to disk approximately once per second.
innodb_flush_log_at_trx_commit=0
# The size of the buffer InnoDB uses for buffering log data. As soon as
# it is full, InnoDB will have to flush it to disk. As it is flushed
# once per second anyway, it does not make sense to have it very large
# (even with long transactions).
innodb_log_buffer_size=256M
# InnoDB, unlike MyISAM, uses a buffer pool to cache both indexes and
# row data. The bigger you set this the less disk I/O is needed to
# access data in tables. On a dedicated database server you may set this
# parameter up to 80% of the machine physical memory size. Do not set it
# too large, though, because competition of the physical memory may
# cause paging in the operating system. Note that on 32bit systems you
# might be limited to 2-3.5G of user level memory per process, so do not
# set it too high.
innodb_buffer_pool_size=4G
# Size of each log file in a log group. You should set the combined size
# of log files to about 25%-100% of your buffer pool size to avoid
# unneeded buffer pool flush activity on log file overwrite. However,
# note that a larger logfile size will increase the time needed for the
# recovery process.
innodb_log_file_size=1G
# Number of threads allowed inside the InnoDB kernel. The optimal value
# depends highly on the application, hardware as well as the OS
# scheduler properties. A too high value may lead to thread thrashing.
innodb_thread_concurrency=8
# The increment size (in MB) for extending the size of an auto-extend InnoDB system tablespace file when it becomes full.
innodb_autoextend_increment=64
# The number of regions that the InnoDB buffer pool is divided into.
# For systems with buffer pools in the multi-gigabyte range, dividing the buffer pool into separate instances can improve concurrency,
# by reducing contention as different threads read and write to cached pages.
innodb_buffer_pool_instances=8
# Determines the number of threads that can enter InnoDB concurrently.
innodb_concurrency_tickets=5000
# Specifies how long in milliseconds (ms) a block inserted into the old sublist must stay there after its first access before
# it can be moved to the new sublist.
innodb_old_blocks_time=1000
# It specifies the maximum number of .ibd files that MySQL can keep open at one time. The minimum value is 10.
innodb_open_files=300
# When this variable is enabled, InnoDB updates statistics during metadata statements.
innodb_stats_on_metadata=0
# When innodb_file_per_table is enabled (the default in 5.6.6 and higher), InnoDB stores the data and indexes for each newly created table
# in a separate .ibd file, rather than in the system tablespace.
innodb_file_per_table=1
# Use the following list of values: 0 for crc32, 1 for strict_crc32, 2 for innodb, 3 for strict_innodb, 4 for none, 5 for strict_none.
innodb_checksum_algorithm=0
# The number of outstanding connection requests MySQL can have.
# This option is useful when the main MySQL thread gets many connection requests in a very short time.
# It then takes some time (although very little) for the main thread to check the connection and start a new thread.
# The back_log value indicates how many requests can be stacked during this short time before MySQL momentarily
# stops answering new requests.
# You need to increase this only if you expect a large number of connections in a short period of time.
back_log=80
# If this is set to a nonzero value, all tables are closed every flush_time seconds to free up resources and
# synchronize unflushed data to disk.
# This option is best used only on systems with minimal resources.
flush_time=0
# The minimum size of the buffer that is used for plain index scans, range index scans, and joins that do not use
# indexes and thus perform full table scans.
join_buffer_size=256K
# The maximum size of one packet or any generated or intermediate string, or any parameter sent by the
# mysql_stmt_send_long_data() C API function.
max_allowed_packet=32M
# If more than this many successive connection requests from a host are interrupted without a successful connection,
# the server blocks that host from performing further connections.
max_connect_errors=100
# Changes the number of file descriptors available to mysqld.
# You should try increasing the value of this option if mysqld gives you the error "Too many open files".
open_files_limit=4161
# If you see many sort_merge_passes per second in SHOW GLOBAL STATUS output, you can consider increasing the
# sort_buffer_size value to speed up ORDER BY or GROUP BY operations that cannot be improved with query optimization
# or improved indexing.
sort_buffer_size=256K
# The number of table definitions (from .frm files) that can be stored in the definition cache.
# If you use a large number of tables, you can create a large table definition cache to speed up opening of tables.
# The table definition cache takes less space and does not use file descriptors, unlike the normal table cache.
# The minimum and default values are both 400.
table_definition_cache=1400
# Specify the maximum size of a row-based binary log event, in bytes.
# Rows are grouped into events smaller than this size if possible. The value should be a multiple of 256.
binlog_row_event_max_size=8K
# If the value of this variable is greater than 0, a replication slave synchronizes its master.info file to disk.
# (using fdatasync()) after every sync_master_info events.
sync_master_info=10000
# If the value of this variable is greater than 0, the MySQL server synchronizes its relay log to disk.
# (using fdatasync()) after every sync_relay_log writes to the relay log.
sync_relay_log=10000
# If the value of this variable is greater than 0, a replication slave synchronizes its relay-log.info file to disk.
# (using fdatasync()) after every sync_relay_log_info transactions.
sync_relay_log_info=10000
# Load mysql plugins at start."plugin_x ; plugin_y".
plugin_load="mysqlx"
# MySQL server's plugin configuration.
loose_mysqlx_port=33060
Given the following table:
CREATE TABLE dbo.T
(
[Description] varchar(8000) NOT NULL,
T1 integer NULL,
T2 integer NULL
);
and data:
INSERT dbo.T
([Description])
VALUES
('| 30 | 30 | INTERNAL AUDIT | RL | OK'),
('| SE+17.5D CYL 1.25'),
('| 10 | 11 | INTERNAL AUDIT | MM | CORRECTED'),
('| 5 | 5 | INTERNAL AUDIT | JY | GOOD'),
('| 56 | 56 | INTERNAL AUDIT | JMS | OK'),
('| 10 | 10 | INTERNAL AUDIT | CN | None'),
('| 3 | 3 | INTERNAL AUDIT | MG | GOOD'),
('| 46 | 47 | INTERNAL AUDIT | AB | None'),
('| 23 | 23 | INTERNAL AUDIT | BA | OK'),
('| 30 | 30 | INTERNAL AUDIT | RL | OK'),
('| 25 | 29 | INTERNAL AUDIT | KV | CORRECTED'),
('| 4 | 3 | INTERNAL AUDIT | KV | PULLING, LIVE AR'),
('| ref # - 0006-4121-02');
The following commented code uses an updatable cursor:
SET XACT_ABORT, NOCOUNT ON;
SET STATISTICS XML OFF;
DECLARE
@Description varchar(8000),
@Pipe1Position integer,
@Pipe2Position integer,
@T1 integer,
@T2 integer;
DECLARE DataCursor CURSOR LOCAL
SCROLL DYNAMIC SCROLL_LOCKS
FOR SELECT [Description] FROM dbo.T
FOR UPDATE OF T1, T2;
OPEN DataCursor;
-- Get the first row
FETCH FIRST FROM DataCursor INTO @Description;
WHILE @@FETCH_STATUS = 0
BEGIN
-- Reset
SET @T1 = NULL;
SET @T2 = NULL;
-- Find the pipe after the first one
SET @Pipe1Position = CHARINDEX('|', @Description, 2);
IF @Pipe1Position > 0
BEGIN TRY
-- Extract the first integer
SET @T1 = CONVERT(integer,
SUBSTRING(@Description, 2, @Pipe1Position - 2));
-- Find the the next pipe
SET @Pipe2Position = CHARINDEX('|', @Description, @Pipe1Position + 1);
IF @Pipe2Position > 0
BEGIN
-- Extract the second integer
SET @T2 = CONVERT(integer,
SUBSTRING(@Description, @Pipe1Position + 1, @Pipe2Position - @Pipe1Position - 1));
END;
END TRY
BEGIN CATCH
END CATCH;
-- Perform the update for this row
UPDATE dbo.T
SET T1 = @T1, T2 = @T2
WHERE CURRENT OF DataCursor;
-- Next row
FETCH NEXT FROM DataCursor INTO @Description;
END;
CLOSE DataCursor;
DEALLOCATE DataCursor;
The final state of the table is:
Description | T1 | T2 |
---|---|---|
| 30 | 30 | INTERNAL AUDIT | RL | OK | 30 | 30 |
| SE+17.5D CYL 1.25 | NULL | NULL |
| 10 | 11 | INTERNAL AUDIT | MM | CORRECTED | 10 | 11 |
| 5 | 5 | INTERNAL AUDIT | JY | GOOD | 5 | 5 |
| 56 | 56 | INTERNAL AUDIT | JMS | OK | 56 | 56 |
| 10 | 10 | INTERNAL AUDIT | CN | None | 10 | 10 |
| 3 | 3 | INTERNAL AUDIT | MG | GOOD | 3 | 3 |
| 46 | 47 | INTERNAL AUDIT | AB | None | 46 | 47 |
| 23 | 23 | INTERNAL AUDIT | BA | OK | 23 | 23 |
| 30 | 30 | INTERNAL AUDIT | RL | OK | 30 | 30 |
| 25 | 29 | INTERNAL AUDIT | KV | CORRECTED | 25 | 29 |
| 4 | 3 | INTERNAL AUDIT | KV | PULLING, LIVE AR | 4 | 3 |
| ref # - 0006-4121-02 | NULL | NULL |
https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=8425325a6e3940eaabb5f69f8004c5b5
The database is not the best place to do this kind of work. The pipe-delimited data looks as if it originated in an external system and was imported from a flat file.
Instead of importing the data as-is and then processing it inside SQL Server, it would be better to use a dedicated ETL tool like SSIS to transform the flat-file data directly, then load it into one or more relational database tables.
A cursor won't be ideal if you have tens of millions of rows to process in a hurry, but it does make the logic clear to follow. It performs pretty well, all things considered.
You should also look into getting onto a more modern version of SQL Server. Things like STRING_SPLIT
and TRY_CONVERT
make coding this sort of thing a lot easier.
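To illustrate that last point, here is a sketch (not part of the answer above) of a set-based version against the same dbo.T table, using TRY_CONVERT in place of the cursor's TRY...CATCH guard; it assumes SQL Server 2012 or later:
UPDATE T
SET T1 = TRY_CONVERT(integer, SUBSTRING(T.[Description], 2, P1.Pos - 2)),
    T2 = CASE WHEN P2.Pos > 0
              THEN TRY_CONVERT(integer,
                   SUBSTRING(T.[Description], P1.Pos + 1, P2.Pos - P1.Pos - 1))
         END
FROM dbo.T AS T
CROSS APPLY (SELECT CHARINDEX('|', T.[Description], 2)) AS P1 (Pos)
CROSS APPLY (SELECT CHARINDEX('|', T.[Description], P1.Pos + 1)) AS P2 (Pos)
-- Rows with no second '|' (or none at all) simply keep NULL, as in the cursor version.
WHERE P1.Pos > 0;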
I'd be interested in the number of total rows you have in your Parts.ManufacturingData
table and what the total runtime currently is for your query. Does the number of rows the execution plan says it's returning (roughly 32 million) actually make sense for your query?
Maybe you'll find a filtered index saves you a little bit of time since it'll pre-materialize only the data you want. How about an index with this definition, does it make any difference?
CREATE NONCLUSTERED INDEX IX_ManufacfuringData_MSLID_Filtered ON Parts.ManufacturingData (MSLID) WHERE MSLID IS NOT NULL;
Note you shouldn't need to do anything different to your query to use the above filtered index once it's created. But you should check the execution plan to ensure the optimizer chose this new index over any other indexes on your table when the query runs.
If that doesn't make any difference, the other thing you can try is adding the FORCESCAN ( https://www.mssqltips.com/sqlservertip/3296/why-the-sql-server-forcescan-hint-exists/ ) hint to your query like so:
select partid,mslid
into ExtractReports.dbo.manufactureparts
from parts.manufacturingdata m with(nolock, FORCESCAN)
where mslid is not null
This would tell the optimizer to use a scan operation instead of seek against your data. This would generally be more performant if roughly a majority of your data in the table meets the criteria of your WHERE
clause. I.e. it's generally faster to scan the entire table at that point and filter out the unwanted rows, than to seek against so many rows. But hard to say if this'll help your circumstances without knowing your data or testing it.
Please note that table hints ( https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-table?view=sql-server-ver15 ) should be used cautiously and only in circumstances where alternative optimization methods are not possible. Some query hints limit the number of available execution plans that the optimizer can choose from, and therefore can result in an error being thrown when certain queries using those hints try to execute.
In this case, I think using the FORCESCAN
hint is likely ok, as your query is simple, and I don't believe it limits the number of query plans as much as other hints do.
I am trying to connect to my Postgres database using SQLAlchemy. It was working fine yesterday (01/27/22). Now I get the following error:
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: role "wsb" does not exist
Except the user does exist:
List of roles
Role name | Attributes | Member of
-----------+------------------------------------------------------------+-----------
wsb | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
wsb2 | | {}
As you can see I tried making another user "wsb2", but that didn't work either. Below is my script:
# imports implied by the rest of the snippet (not shown in the original excerpt)
from sqlalchemy import (create_engine, MetaData, Table, Column,
                        INTEGER, TEXT, TIMESTAMP, ARRAY)

db_string = "postgresql://wsb:wsb@localhost:5432/wsb"
db = create_engine(db_string)
meta = MetaData()
submissions_table = Table('wsb_submissions', meta,
Column('index', INTEGER),
Column('submission_id', TEXT),
Column('title', TEXT),
Column('created', TIMESTAMP),
Column('author', TEXT),
Column('ticker_mentioned', TEXT),
Column('subreddit', TEXT),
Column('emotes', ARRAY(TEXT)),
Column('rocket_emotes', INTEGER)
)
comments_table = Table('wsb_comments', meta,
Column('index', INTEGER),
Column('comment_id', INTEGER),
Column('body', TEXT),
Column('created', TIMESTAMP),
Column('author', TEXT),
Column('rocket_emotes', INTEGER),
Column('subreddit', TEXT),
Column('emotes', ARRAY(TEXT)),
Column('submission_key', TEXT)
)
db.connect()
also here is my Dockerfile:
FROM postgres:13.2
ENV POSTGRES_PASSWORD=wsb
ENV POSTGRES_USER=wsb
ENV POSTGRES_DB=wsb
COPY wsb.sql /docker-entrypoint-initdb.d/
EXPOSE 5432
I have a value in table1 like this:
Value
-------
JT, KF, LF
On a select from the other table, how can I get a value like this?
Select a.Field from table2
Inner Join Table1 on Table1.field = Table2.field
Where Type IN(Table2.Value)
To do a rolling upgrade of an Availability Group from Windows 2012R2, SQL 2012 to Windows 2019, SQL 2019 do I have to have an intermediary set of servers on Windows 2016? Based on the documentation it sounds like you can only go up one version (verbiage: "You can upgrade to the next version"). The only clustering component is for the Availability Group, no other roles and no services are clustered. I know SQL can go directly to 2019, it is the clustering component of the AG I am unsure about.
The answer depends on the query you want to optimize.
If the query looks like
SELECT ...
FROM ...
WHERE col % 'string'
a GIN index usually works better.
If the query looks like
SELECT ...
FROM ...
ORDER BY col <-> 'string'
LIMIT 1
you need to use a GiST index, because a GIN index cannot support that.
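For reference, with pg_trgm the two index flavors are created with different operator classes; a minimal sketch (table and column names are placeholders):
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- supports the % (similarity) filter shown in the first query
CREATE INDEX idx_tbl_col_trgm_gin  ON tbl USING gin  (col gin_trgm_ops);

-- supports ORDER BY col <-> 'string' LIMIT 1 (nearest neighbor), which GIN cannot
CREATE INDEX idx_tbl_col_trgm_gist ON tbl USING gist (col gist_trgm_ops);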