Chat log archive / Freenode / #mysql / 2015 / October / 15 / 1
f00dWorksta
thumbs: delete FROM table1 WHERE id IN (SELECT id FROM table2 WHERE table2 IN (select id from table2 where id2 = 1) GROUP BY id HAVING COUNT(DISTINCT id2) = 1);
thumbs: This runs instantly: SELECT * FROM table1 WHERE id IN (SELECT id FROM table2 WHERE table2 IN (select id from table2 where id2 = 1) GROUP BY id HAVING COUNT(DISTINCT id2) = 1);
thumbs: crap that sql is incorrect, I replaced the names wrongly...
thumbs: corrected SQL: delete FROM table1 WHERE id IN (SELECT id FROM table2 WHERE id IN (select id from table2 where id2 = 1) GROUP BY id HAVING COUNT(DISTINCT id2) = 1);
Nothing4You
hey there
is it technically possible to have 1 column filled with left joins from 2 tables?
Xgc
f00dWorksta: You have an unnecessary subquery.
snoyes
Nothing4You: as in, if the first table doesn't have an entry, fill in with the value from the second table?
Nothing4You
i have 1 table referencing either table a or table b, do i need to differentiate this on the application side or can i create a query that merges the results
yes, exactly
snoyes
use the IFNULL or COALESCE functions
Nothing4You
thanks, i'll look that up
toocwa
hi :) what is the correct command to grant access to mysql from an IP address? would be good if it allowed access to all accounts/databases, but granting access to an existing account would be ok too. Will this do it without having to do "IDENTIFIED BY"? GRANT ALL PRIVILEGES ON *.* TO 'user'@'my_ip_address'
i know i can type things in and then test accessing it but dont want to unintentionally allow too much with no password or something lol
Xgc
Nothing4You: Replace the inner-most subquery with id2 = 1
snoyes
toocwa: if you haven't already done a CREATE USER, then that won't work
toocwa
snoyes: there is already a user
Xgc
f00dWorksta: Sorry. That was for you.
snoyes
then yes, you do that GRANT and don't need IDENTIFIED BY.
Nothing4You
snoyes: thank you, coalesce works fine
toocwa
perfect ill try it then thanks
Nothing4You
coalesce( staStations.regionID, outpost.regionID ) AS regionID
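[Editor's note: a minimal sketch of the COALESCE-over-two-LEFT-JOINs pattern snoyes suggested, using SQLite rather than MySQL and hypothetical table/column names (`table_a`, `table_b`, `refs`), not Nothing4You's actual schema.]

```python
import sqlite3

# Sketch: one row references either table_a or table_b; two LEFT JOINs plus
# COALESCE fold the two possible sources into a single output column.
# All names here are hypothetical stand-ins.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE table_a (id INTEGER PRIMARY KEY, region_id INTEGER);
CREATE TABLE table_b (id INTEGER PRIMARY KEY, region_id INTEGER);
CREATE TABLE refs (id INTEGER PRIMARY KEY, target_id INTEGER);
INSERT INTO table_a VALUES (1, 100);
INSERT INTO table_b VALUES (2, 200);
INSERT INTO refs VALUES (10, 1), (11, 2);
""")
rows = cur.execute("""
SELECT r.id, COALESCE(a.region_id, b.region_id) AS region_id
FROM refs r
LEFT JOIN table_a a ON a.id = r.target_id
LEFT JOIN table_b b ON b.id = r.target_id
ORDER BY r.id
""").fetchall()
print(rows)  # each ref resolves via whichever table matched
```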
toocwa
i get nervous about mysql permissions for some reason :P
f00dWorksta
Xgc: not sure what you mean, but I created the subquery and put it into a temp table, and used the temp table for the delete instead and it runs fast now :/
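[Editor's note: a sketch of the simplification Xgc was pointing at, in SQLite rather than MySQL. The intent of the original query, "ids whose only id2 value is 1", can be expressed with one grouped subquery and no nesting; the exact HAVING clause below is one equivalent formulation, not f00dWorksta's verbatim query.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY);
CREATE TABLE table2 (id INTEGER, id2 INTEGER);
INSERT INTO table1 VALUES (1), (2), (3);
-- id 1: only id2=1 (deleted); id 2: id2 1 and 2 (kept); id 3: only id2=2 (kept)
INSERT INTO table2 VALUES (1, 1), (2, 1), (2, 2), (3, 2);
""")
# Single grouped subquery: ids whose one-and-only distinct id2 value is 1.
cur.execute("""
DELETE FROM table1 WHERE id IN (
    SELECT id FROM table2
    GROUP BY id
    HAVING COUNT(DISTINCT id2) = 1 AND MAX(id2) = 1
)
""")
conn.commit()
remaining = [r[0] for r in cur.execute("SELECT id FROM table1 ORDER BY id")]
print(remaining)  # ids whose table2 rows are not exclusively id2 = 1
```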
toocwa
so if I do "GRANT ALL PRIVILEGES ON *.* TO 'user'@'my_ip_address'" how would i revoke that exactly?
REVOKE ALL PRIVILEGES, GRANT OPTION FROM 'user'@'ip'; ??
snoyes
yep
toocwa
great thanks
oh I got this ERROR 1045 (28000): Access denied for user 'user'@'%' (using password: YES)
when i try to grant
oh i only have access to specific database i think thats why, so i can't do *.*, duh
snoyes
then you're logged in as a user that doesn't have the grant option, or doesn't have the privileges you are trying to grant
toocwa
ah could be that too
is there a way to check permissions of my current user?
snoyes
SHOW GRANTS; tells you what you have
toocwa
;)
snoyes
you can't grant more than that
green-
doing some query testing with mysqlslap and queries with SQL_NO_CACHE specified, but getting significantly different results on the first test (always slower) vs. subsequent tests ... is there some reason the query cache might be getting utilized despite my attempting to do otherwise, or some other explanation for the results?
snoyes
probably not the query cache, but some other buffer
green-
snoyes: hmm, i suppose that's okay for testing purposes. unfortunately, partitioning thus far is not introducing speedier queries to the degree that i'd hoped. neither is the extra 100GB of RAM on this server or the wildly faster CPUs.
does partitioning allow mysql to hold table partitions in the innodb buffer pool individually as opposed to trying to stuff the entire table in?
danblack
!t green planning
ubiquity_bot
green: This is loosely true: For planning - competent hardware, competent mysql server config as to buffers and caches, normalized schema design, correct table storage engine choice, proper data types, adequate indexing, competent queries. For troubleshooting - reverse the order. :-)
danblack
notice partitions isn't mentioned. follow the reverse order
green-
danblack: partitions not mentioned why? surely they offer performance benefits, no?
part of testing the partitioning changes is testing index changes as well, as partitioning in this case required important index changes
but anyway, the reason partitioning seems particularly important is that, even with 128GB of RAM in this new box, I still can't fit the entire table into the buffer pool (250GB table)
darwin
partitions offer limitations in addition to benefits.
as do most technical decisions.
green-
darwin: understood, i'm just trying to get to the bottom of how partitioning affects buffer pool usage. I'm operating under the assumption that, given the size of the table in question, that buffer pool is the main limiting factor (at least, that's what I'm trying to test), but I can't seem to find information on whether partitioning changes what the buffer pool stores in any way.
danblack
in short - partitioning has no effect on buffer pool usage.
stop assuming and test.
start with the query. explain, indexes etc...
green-
danblack: i am testing, and not getting the results I expect, so I'm trying to see if I can explain it by the fact that partitioning has no effect on buffer pool usage -- and it seems like from what you are saying, that I can
darwin
partitioning changes one table into multiple tables
so just imagine the table as a set of other tables
green-
that said, if partitioning has no effect on buffer pool usage, then I don't see the point of it at all
darwin
that's how it behaves wrt the buffer pool
green-: DROP PARTITION > DELETE FROM ...
also partitions can prune on read
so you supply the partition key, and you check only 1 partition, instead of the entire table. the individual partitions have less full btrees than the full table would. etc.
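[Editor's note: an illustrative Python sketch of darwin's pruning point, with no real MySQL partitions involved. When the partition key is supplied, only the matching partition's rows are scanned instead of the whole table; the bucket counts and payloads below are made up.]

```python
# 1000 rows of (id, partition_key, data), "partitioned" on key into 4 buckets.
rows = [(i, i % 4, f"payload-{i}") for i in range(1000)]

partitions = {}
for row in rows:
    partitions.setdefault(row[1], []).append(row)

def lookup(key, id_):
    # Pruning: only the partition matching the key is scanned,
    # not all 1000 rows.
    scanned = partitions.get(key, [])
    return [r for r in scanned if r[0] == id_], len(scanned)

result, scanned_count = lookup(2, 42)
print(result, scanned_count)  # one hit after scanning 250 rows, not 1000
```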
green-
well then that seems like it would have an impact -- if i split this table into 10 tables, and the buffer pool can hold a 25GB table instead of a 250GB table, I'd improve efficiency, no?
danblack
how much of a table is checked depends on the query, not where partition boundaries are
darwin
sorry, it'll still hold the 25gb hot set of pages?
it may pollute the buffer pool less via less scanning, in some cases
green-
darwin: right, and it can fit 25gb no problem, but not 250gb
i don't understand why he's assuming i haven't looked at the query
i know what part of the table the query is asking to check, what i'm trying to determine is whether the relevant data and indexes are in memory rather than causing an I/O hit
and whether partitioning will help keep the parts of the table that are checked most often (99% of the time, that being the last 10% of rows added) in memory while not keeping parts of the table that are rarely accessed in memory and thus keeping that part of the pool free
at least that's my line of thinking, which could be total bs
darwin
what makes you think it's going to load the entire 250gb table into the buffer pool?
are you insane?
the buffer pool will contain the pages you actually access
no matter if you partition or not
if the table is *very* large, the pages may be slightly inefficiently filled, etc.
but in general they are broadly comparable pages except in extreme cases.
green-
suggesting it will only load what is regularly accessed implies that it does its own internal partitioning (in a sense) by only loading those pages, no? if that's true, what's the point in partitioning in the first place?
if it can load only the parts of keys and parts of tables that are regularly accessed, then partitioning seems redundant or pointless -- as the schema designer seems unlikely to outwit the engine
meh, i'm spouting garbage for all i know. doing my best to understand this.
darwin
uh... dude.
the buffer pool contains pages, 16kb by default
it is effectively a read-and-write-through cache of these pages
if you read the page (in most cases) it will be added to the buffer pool
if you modify the page, it will be loaded into the buffer pool and then MODIFIED THERE
at some later point, that modified page will be flushed to disk
(if you don't access it for a while)
therefore, in general, pages you read a lot (hot pages) will be in the buffer pool unless your working / hot set is much larger than the buffer pool is
large tables are less efficiently stored in terms of full pages than smaller tables
but other than that, there is not much of a difference.
trust me, partitioning is not pointless
as I've already said, it allows you to prune on the read path
it decreases the size (and thereby improves the performance of) btrees
it allows you to drop chunks of the table in a DDL operation (DROP PARTITION) instead of DML (DELETE TONS OF ROWS)
it is mostly orthogonal to the behavior of the buffer pool you're talking about
but that doesn't make it "pointless"
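[Editor's note: a toy read-through LRU cache sketching the behavior darwin describes. The real InnoDB buffer pool is far more sophisticated (midpoint insertion, flushing, etc.); this only illustrates that the pool holds recently used pages, not whole tables, so a hot working set stays resident regardless of total table size. All sizes are made up.]

```python
from collections import OrderedDict

class BufferPool:
    """Toy LRU page cache: pages are cached on read, evicted when full."""

    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = OrderedDict()
        self.disk_reads = 0

    def read_page(self, page_no):
        if page_no in self.pages:
            self.pages.move_to_end(page_no)  # hot page: keep it resident
        else:
            self.disk_reads += 1             # cold page: I/O hit
            self.pages[page_no] = f"page-{page_no}"
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
        return self.pages[page_no]

pool = BufferPool(capacity_pages=100)
# A "table" much larger than the pool, but with a hot set of only 50 pages:
for _ in range(20):
    for page in range(50):
        pool.read_page(page)
print(pool.disk_reads)  # only the 50 cold first reads hit disk
```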
green-
darwin: thanks, appreciate you bearing with me. your explanation of how the buffer pool works is mostly as I understood it, although I'm still a bit grey on how to further enhance performance on these queries ... as partitioning does not seem to be helping I'll have to look to something else as a next step.
the hardware has helped, a lot. 10 hour queries are down to 4-5 minutes, but I'd like to get quite a bit more, besides the fact that I don't want to rely entirely on hardware as a solution (thus my hopes that partitioning would help)
guess it's time to dig back into indexes, especially given that index changes that came along with partitioning (which essentially moved a standalone index into the PK) don't seem to be altering results either
darwin
green-: what queries? without seeing your schema and etc., no idea.
green-: are you seeing pruning on your queries? use explain extended or whatever