Home News Feeds Planet MySQL
Newsfeeds
Planet MySQL
Planet MySQL - http://www.planetmysql.org/

  • Meet The MySQL Experts Podcast: MySQL Thread Pool
    In the latest episode of our “Meet The MySQL Experts” podcast, Mikael Ronstrom, senior MySQL Architect, explains us how the MySQL Thread Pool improves MySQL Scalability. You can try out the MySQL Thread Pool via our MySQL Enterprise Edition Trial. And…MySQL being of Nordic origin, Hyvää Vappua/Glada Vappen to all the Finns and Swedes among us! Enjoy the podcast!

  • Zend CE has a Worm
    After updating the AVGFree virus definitions, I was surprised to find that Zend CE (Community Edition) 4.0.6 had a worm in the JavaServer.exe file. There was greater surprise when Zend CE 5.3.9 (5.6.0-SP1) also had the same worm. This is the message identifying the worm (click on it to see a full size image), and you can read about this particular worm on the Mcafee site: Unless you have the full version of AVG or another security program to try and fix the file, you can only quarantine the file. Quarantine or removal disables Zend CE from working. It begs the questions, “how does Zend release a core file with a worm?” I’ll update this when there’s a fix to this problem.

  • Initial Reactions to MySQL 5.6
    New versions of MySQL are always interesting to try out. Often they have features which I may have asked for myself so it’s satisfying to see them eventually appear on a system I use. Often other new features make life easier for the DBA. Finally we hope overall performance will improve and managing the server(s) will be come easier. So I had a system which needs to make heavy writes, and performance was a problem, even when writing to SSDs. Checkpointing seemed to be the big issue and the ib_logfile size in MySQL 5.5 is limited to 4 GB. That seems a lot, but once MySQL starts to fill these files (and this happens at ~70% of the total I believe),  checkpointing kicks in heavily, and slows things down.  So the main reason for trying out MySQL 5.6 was to see how things performed with larger ib_logfiles. (Yes, MariaDB 5.5 can do this too.) Things improved a lot for my specific workload which was great news, but one thing which I noticed was that initial install / setup time of a system with these large files increased a huge amount. (I was using 2 x 16000M files.)  That was noticeable even on a box with SSDs. On a box with HDs that would be horrible. mysql_install_db now uses innodb to setup some system things and restarts the mysqld binary a couple of times. When you also have a large innodb_buffer_pool (I was trying on a box with 192GB of RAM) the startup of mysqld each time took quite a while. So initial thoughts are that the bootstrapping probably does not need to use the normal innodb_buffer_pool_size setting (but should perhaps overwrite it with the minimal size needed). Managing the size of the ib_logfiles is also something that requires restarting mysqld. Given the sort of experimenting that I was doing it would be nice to be able to dynamically configure this.  I can see many issues with changing the sizes of the existing files, but what seems easier to implement would be to be able to increase or decrease the number of files used (removing the older, higher number files once they are “used”/checkpointed), and that would provide a lot more freedom to the DBA. I clone servers quite a bit and the extra 32 GB of these redo log files is a pain to clone/copy, and time consuming, so being able to reduce the number of files, and checkpoint prior to shutting the server down for cloning, but then re-enabling the normal value afterwards would save a lot of time cloning. So would be a handy feature. I also tried playing with the new GTID features.  They look great. Except for one thing.  This expects master and slaves to have binlogs enabled, so that the GTID values are kept locally, and on startup.  I manage several boxes where for performance reasons the slaves do not have binlogs enabled.  If it dies the slave gets recloned.  Adding back the requirement for binlogs ( and log_slave_updates ) on a slave to enable this feature seems wrong. I can see the need that the different GTID values are kept somewhere, but don’t see the need to keep all binlog information, at least on a slave. Given the ratio of slaves to masters can be quite high that can be an issue.  If necessary write this information somewhere else, why not in an InnoDB table, so that on sever recovery you have consistent information with the rest of the database, something that might not happen with the binlogs… ? Talking of binlogs one new feature I’m using, which I think is good is: binlog_row_image = minimal. This reduces the size of the binlogs and thus I/O.  I’m missing the next step which would be to compress those binlog events, and reduce binlog size further. That may not be an issue on many systems but several servers I manage write over 100 GB of binlogs a day. Reducing this further by being able to compress the binlogs would be nice, and having better control of expire_logs_days too (as the daily granularity can be too large in many cases for me) would help. NOTE: In MySQL 5.6.5 with gtid_mode = ON , mysql_install_db crashes! Bug reported to Oracle, and I guess will get fixed soon. I have seen a few other crashes in 5.6.4 and 5.6.5, I can’t provide links unfortunately as they’re not public. They are still not resolved.  I’d like to make the details of these crashes public as others may come across them, or have come across them before me, but these issues are not visible in http://bugs.mysql.com. Would be nice if they were but requires me duplicating information which is a pain.  Oracle, please make this easier. It is good for us all. So with the little testing I’ve done so far MySQL 5.6 looks good. I need to spend more time with it and investigate the new features, many of which will make life easier. When it finally goes GA I will have to spend a lot more time seeing how it behaves but I believe there are still a few issues which still need resolving prior to that happening.

  • MySQL Enterprise Monitor 2.3.10 Is Now GA!
    We are pleased to announce that MySQL Enterprise Monitor 2.3.10 is now available for download on the My Oracle Support (MOS) web site as our latest GA release. It will also be available via the Oracle Software Delivery Cloud in approximately 1-2 weeks. This is a maintenance release that contains several new features and fixes a number of bugs. You can find more information on the contents of this release in the changelog: http://dev.mysql.com/doc/mysql-monitor/2.3/en/mem-news-2-3-10.html You will find binaries for the new release on My Oracle Support: https://support.oracle.com Choose the "Patches & Updates" tab, and then use the "Product or Family (Advanced Search)" feature. And from the Oracle Software Delivery Cloud (in about 1-2 weeks): http://edelivery.oracle.com/ Choose "MySQL Database" as the Product Pack and you will find the Enterprise Monitor along with other MySQL products. If you haven't looked at 2.3 recently, please do so now and let us know what you think. Thanks and Happy Monitoring! - The MySQL Enterprise Tools Development Team   

  • SchoonerSQL Automates Failure Handling & Failover
    There are different types of failures in the database environment ranging from the loss of the network, to the loss of an instance, all the way to the loss of a node (the server hardware). A robust database is one that can detect such failures and automate the failover and recovery process without any user intervention.SchoonerSQL does exactly that: it detects failures and provides an immediate and automated failover process. Below are the failover scenarios and how SchoonerSQL will handle them.Instance FailureConsider three instances in a synchronous replication group where Node 1 has the master instance, and Node 2 and Node 3 have slave instances.The master has one write virtual IP (10.1.1.2) and one read virtual IP (10.1.1.3); slave instances have read virtual IPs (10.1.1.4, 10.1.1.5) as shown in the diagram below.When a master instance fails due to a crash or a shutdown, the failure handler will immediately detect it and migrate all the virtual IPs from the failed instance to Node 2 & Node 3. The entire process is generally quick with an average failover time of 3 to 5 seconds. Clients automatically connect to the surviving node and continue running the queries. The entire operation is transparent to the end-user.SchoonerSQL also possess self-healing capability with an auto restart when a crash happens. The instance will rejoin the cluster once it catches up and is in sync (virtual IPs will be re-distributed to this instance automatically when it joins the group).  Node Failure The failure handler detects when the entire node crashes or shuts down. Virtual IPs will then be migrated from the failed node to the instances on the remaining live nodes in the cluster. Replication Network FailureConsider two instances, one acting as master with write, read virtual IPs and the other acting as slave with read virtual IP.Consider a scenario where the replication network fails causing network partitioning. The failure handler immediately detects this and shuts down the slave. The virtual IP of the slave is then migrated over to the master as shown in the diagram. To conclude, SchoonerSQL not only detects failures but also automates the failover process thereby providing users a seamless and a transparent experience.

  • Portable Tablespace in InnoDB I test it!
     Overview I have recently blog on the company site about portable Tablespaces   (http://www.pythian.com/news/32547/mysql-bi-weekly-news-04262012/) What I was saying is that is one of the things that could make us, people who work with MySQL/InnoDB happy. This because it is a useful feature for administration and not just a "cool" thing to have. My words were "This is a huge improvement that only people working daily with MySQL/InnoDB can understand, so far it is still in the lab version but we all really hope to have it deliver with the new MySQL 5.6 GA" From there I decide to try it right away, and to also try to extend the test. So what I have done is to take the MySQL version from lab and start to play with tables and tablespaces. Below my results and final considerations. After having downloaded and install the Lab version (root@localhost) [(none)]>status; -------------- /home/mysql/templates/mysql-56p/bin/mysql Ver 14.14 Distrib 5.6.6-labs-april-2012, for linux2.6 (i686) using EditLine wrapper   Connection id: 1 Current database: Current user: root@localhost SSL: Not in use Current pager: stdout Using outfile: '' Using delimiter: ; Server version: 5.6.6-labs-april-2012-log MySQL Community Server (GPL) Protocol version: 10 Connection: Localhost via UNIX socket Server characterset: latin1 Db characterset: latin1 Client characterset: utf8 Conn. characterset: utf8 UNIX socket: /home/mysql/instances/my56testm/mysql.sock Uptime: 21 sec   Threads: 1 Questions: 5 Slow queries: 0 Opens: 16 Flush tables: 1 Open tables: 9 Queries per second avg: 0.238   Real work now and create a schema, tables and feed them.   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 (root@localhost) [(none)]>use test_tablespace1 Database changed (root@localhost) [test_tablespace1]>show tables; Empty set (0.00 sec) (root@localhost) [test_tablespace1]>create table tbtest(a int auto_increment PRIMARY KEY, b char(3) DEFAULT 'AAA', dat TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP); Query OK, 0 rows affected (0.24 sec) (root@localhost) [test_tablespace1]> (root@localhost) [test_tablespace1]>insert into tbtest (b) values ('aaa'),('bbb'),('ccc'),('dddd'); Query OK, 4 rows affected, 1 warning (0.05 sec) Records: 4 Duplicates: 0 Warnings: 1 (root@localhost) [test_tablespace1]>insert into tbtest (b) select b from tbtest; Query OK, 4 rows affected (0.05 sec) Records: 4 Duplicates: 0 Warnings: 0 .... (root@localhost) [test_tablespace1]>insert into tbtest (b) select b from tbtest; Query OK, 4096 rows affected (0.44 sec) Records: 4096 Duplicates: 0 Warnings: 0     Now what we do have on fs ? 1 2 3 4 5 6 rwx------ 2 mysql mysql 4096 2012-04-27 14:42 ./ drwxr-xr-x 4 mysql mysql 4096 2012-04-27 14:28 ../ -rw-rw---- 1 mysql mysql 65 2012-04-27 14:28 db.opt -rw-rw---- 1 mysql mysql 8606 2012-04-27 14:42 tbtest.frm -rw-rw---- 1 mysql mysql 360448 2012-04-27 14:45 tbtest.ibd root@000033:/home/mysql/instances/my56testm/data/test_tablespace1#   Time to RAISE THE COMMAND 1 2 (root@localhost) [test_tablespace1]>FLUSH TABLES tbtest WITH READ LOCK; Query OK, 0 rows affected (0.00 sec)   Check on the file system to see the new cfg file 1 2 3 4 5 6 7 drwx------ 2 mysql mysql 4096 2012-04-27 14:51 ./ drwxr-xr-x 4 mysql mysql 4096 2012-04-27 14:28 ../ -rw-rw---- 1 mysql mysql 65 2012-04-27 14:28 db.opt -rw-rw---- 1 mysql mysql 440 2012-04-27 14:51 tbtest.cfg <--------------IS THERE!!! -rw-rw---- 1 mysql mysql 8606 2012-04-27 14:42 tbtest.frm -rw-rw---- 1 mysql mysql 360448 2012-04-27 14:45 tbtest.ibd root@000033:/home/mysql/instances/my56testm/data/test_tablespace1#   And given I am curious, I read insight: we have there: 1 2 3 4 5 6 7 8 Name of the machine: 000033 Schema/Table name: test_tablespace1/tbtest Table definition: a, b, dat last ROW_ID: DB_ROW_ID Transaction ID: DB_TRX_ID Rollback pointer: DB_ROLL_PTR Primary kye and value : PRIMARY a Last transaction valuea: DB_TRX_ID DB_ROLL_PTR b dat   Now let us copy then remove it and see what happens. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# mkdir -p /home/mysql/backup root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# cp tbtest.* /home/mysql/backup/ root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# ll total 380 drwx------ 2 mysql mysql 4096 2012-04-27 14:59 ./ drwxr-xr-x 4 mysql mysql 4096 2012-04-27 14:28 ../ -rw-rw---- 1 mysql mysql 65 2012-04-27 14:28 db.opt -rw-rw---- 1 mysql mysql 440 2012-04-27 14:51 tbtest.cfg -rw-rw---- 1 mysql mysql 8606 2012-04-27 14:42 tbtest.frm -rw-rw---- 1 mysql mysql 360448 2012-04-27 14:45 tbtest.ibd root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# ll /home/mysql/backup total 376 drwxr-xr-x 2 root root 4096 2012-04-27 14:59 ./ drwxrwxr-- 30 mysql mysql 4096 2012-04-27 14:59 ../ -rw-r----- 1 root root 440 2012-04-27 14:59 tbtest.cfg -rw-r----- 1 root root 8606 2012-04-27 14:59 tbtest.frm -rw-r----- 1 root root 360448 2012-04-27 14:59 tbtest.ibd root@000033:/home/mysql/instances/my56testm/data/test_tablespace     I have the files in the backup dir Now is time to unlock the tables: 1 UNLOCK TABLES;   And check what happened: 1 2 3 4 5 6 drwx------ 2 mysql mysql 4096 2012-04-27 15:01 ./ drwxr-xr-x 4 mysql mysql 4096 2012-04-27 14:28 ../ -rw-rw---- 1 mysql mysql 65 2012-04-27 14:28 db.opt -rw-rw---- 1 mysql mysql 8606 2012-04-27 14:42 tbtest.frm -rw-rw---- 1 mysql mysql 360448 2012-04-27 14:45 tbtest.ibd     No more cfg file. Copy the table to something else and Drop the table: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 (root@localhost) [test_tablespace1]>create table tbtest2 select * from tbtest; Query OK, 8192 rows affected (1.02 sec) Records: 8192 Duplicates: 0 Warnings: 0 (root@localhost) [test_tablespace1]> Drop tbtest: (root@localhost) [test_tablespace1]>drop table tbtest; Query OK, 0 rows affected (0.08 sec) (root@localhost) [test_tablespace1]>show tables; +----------------------------+ | Tables_in_test_tablespace1 | +----------------------------+ | tbtest2 | +----------------------------+ 1 row in set (0.00 sec)     Create a fake table tbtest 1 2 3 4 5 6 7 8 9 10 11 12 13 (root@localhost) [test_tablespace1]>create table tbtest(a char(1)); Query OK, 0 rows affected (0.23 sec) (root@localhost) [test_tablespace1]>ALTER TABLE tbtest DISCARD TABLESPACE; Query OK, 0 rows affected (0.04 sec) (root@localhost) [test_tablespace1]> drwx------ 2 mysql mysql 4096 2012-04-27 15:05 ./ drwxr-xr-x 4 mysql mysql 4096 2012-04-27 14:28 ../ -rw-rw---- 1 mysql mysql 65 2012-04-27 14:28 db.opt -rw-rw---- 1 mysql mysql 8606 2012-04-27 15:02 tbtest2.frm -rw-rw---- 1 mysql mysql 425984 2012-04-27 15:02 tbtest2.ibd -rw-rw---- 1 mysql mysql 8554 2012-04-27 15:04 tbtest.frm -rw-rw---- 1 mysql mysql 98304 2012-04-27 15:04 tbtest.ibt <----------- DETACHED tablespace     COPY back the files: 1 2 3 4 5 6 7 8 9 10 11 12 root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# cp /home/mysql/backup/* . root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# ll total 904 drwx------ 2 mysql mysql 4096 2012-04-27 15:06 ./ drwxr-xr-x 4 mysql mysql 4096 2012-04-27 14:28 ../ -rw-rw---- 1 mysql mysql 65 2012-04-27 14:28 db.opt -rw-rw---- 1 mysql mysql 8606 2012-04-27 15:02 tbtest2.frm -rw-rw---- 1 mysql mysql 425984 2012-04-27 15:02 tbtest2.ibd -rw-r----- 1 root root 440 2012-04-27 15:06 tbtest.cfg -rw-rw---- 1 mysql mysql 8606 2012-04-27 15:06 tbtest.frm -rw-r----- 1 root root 360448 2012-04-27 15:06 tbtest.ibd -rw-rw---- 1 mysql mysql 98304 2012-04-27 15:04 tbtest.ibt   Fix permissions 1 2 3 4 5 6 7 8 9 10 11 12 root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# chown mysql:mysql tbtest.* root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# ll total 904 drwx------ 2 mysql mysql 4096 2012-04-27 15:06 ./ drwxr-xr-x 4 mysql mysql 4096 2012-04-27 14:28 ../ -rw-rw---- 1 mysql mysql 65 2012-04-27 14:28 db.opt -rw-rw---- 1 mysql mysql 8606 2012-04-27 15:02 tbtest2.frm -rw-rw---- 1 mysql mysql 425984 2012-04-27 15:02 tbtest2.ibd -rw-r----- 1 mysql mysql 440 2012-04-27 15:06 tbtest.cfg -rw-rw---- 1 mysql mysql 8606 2012-04-27 15:06 tbtest.frm -rw-r----- 1 mysql mysql 360448 2012-04-27 15:06 tbtest.ibd -rw-rw---- 1 mysql mysql 98304 2012-04-27 15:04 tbtest.ibt   Remove fake table space and import back the old one 1 2 3 root@000033:/home/mysql/instances/my56testm/data/test_tablespace1# rm -f tbtest.ibt (root@localhost) [test_tablespace1]>ALTER TABLE tbtest IMPORT TABLESPACE; ERROR 1801 (HY000): InnoDB: Number of columns don't match, table has 4 columns but the tablespace meta-data file has 6 columns   Error, well that was easy to understand, my fault I was suppose to create a fake table with the same structure not different. Let me repeat the process of the fake table. I did: attach the ibt table space (I had it saved in backup) drop it recreate table as for initialal structure detach it again. copy back the old idb file and cfg 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 (root@localhost) [test_tablespace1]>ALTER TABLE tbtest IMPORT TABLESPACE; Query OK, 0 rows affected (2.93 sec) (root@localhost) [test_tablespace1]>check table tbtest; +-------------------------+-------+----------+----------+ | Table | Op | Msg_type | Msg_text | +-------------------------+-------+----------+----------+ | test_tablespace1.tbtest | check | status | OK | +-------------------------+-------+----------+----------+ 1 row in set (0.02 sec) (root@localhost) [test_tablespace1]>select count(*) from tbtest; +----------+ | count(*) | +----------+ | 8192 | +----------+ 1 row in set (0.00 sec)   WOW it works that could make me happy but given I am never happy (enough) Let me try a crazy thing. Create a fake table tbtest3 change the info in the cfg file and the table space filename try to import it. Let's go ... First copy and modify the cfg file: 1 2 3 root@000033:/home/mysql/backup# cp tbtest.ibd tbtest3.ibd root@000033:/home/mysql/backup# cp tbtest.cfg tbtest3.cfg root@000033:/home/mysql/backup# vi tbtest3.cfg   Try to attach it: 1 2 (root@localhost) [test_tablespace1]>ALTER TABLE tbtest3 IMPORT TABLESPACE; ERROR 1712 (HY000): InnoDB: While reading table name: 'I/O error'.   No luck, it would have being to cool and easy. Ok so Conclusions ==================== 1) we can now do export - import of table spaces a little bit more easily 2) we cannot move tablespace cross schemas 3) we cannot attach a tablespace to another table I understand that would be too cool and we must accept what we have, and it would be also nice to take a look at the code. The full procedure in a thumb ======================================== Assuming you already have the table in place take the table creation to replicate the structureSHOW CREATE TABLE tbtest\G Lock the table to copy itFLUSH TABLE tbtest WITH READ LOCK; Copy somewhere the files DON'T forget the .cnf UNLOCK TABLES; Drop the table Create a fake table using the create statement stored before Detach the table ALTER TABLE tbtest DISCARD TABLESPACE; MOVE !!! the *.ibt file in a safe place Copy over the previous files from the backup directory CHECK PERMISSION!! Import back the table space ALTER TABLE tbtest IMPORT TABLESPACE; check table;   {joscommentenable}  

  • Mozilla DB News, Fri Apr 27th
    It has been 4 weeks since I last posted the goings-on for Mozilla DBs. April is always a crazy month because of the annual MySQL conference (Some great pics here). This year it was the Percona Live: MySQL Conference and Expo. And of course as soon as I get caught up from the conference, I have to submit more sessions to MySQL Connect (call for papers closes Sunday May 6th) and Percona Live: NYC (anyone know when the call for papers for this will close?). At the conference, I gave a lightning talk and a tutorial. I have posted the slides for those interested. Unfortunately, Percona asked me not to post the recordings of tutorials. I had cleared recording with them, but apparently during the last tutorial session one of the 2 video cameras was turned off and I was informed after the fact that I was not supposed to be recording them. Which is odd, since O’Reilly had no problems with me recording tutorials in 2008 (memcached tutorial), 2009 (part 1 and part 2 of a metadata tutorial) and 2010 (part 1>/A> and part 2 of a tutorial about config options). I have no videos from 2011 since I did not attend the conference. At any rate, this is just to let you know that due to Percona’s policy, there will not be any tutorial recordings available (should they decide to change their policy, the tapes have not been recorded over yet). The session recordings are forthcoming, I will blog here when they are ready for viewing. We have also been busy with data center moves! Due to MySQL’s flexibile architecure, we were able to move db services with very little interruption to the end user. This is a big deal because we have to change monitoring checks and network flows as well as MySQL ACLs (database authorization – Mozilla is very good about only allowing specific hosts or groups of hosts to connect to MySQL!) In the last 4 weeks, we have done many moves: We not only moved the database cluster that buildbot was on, but we also built a cluster specifically for buildbot, with no other services on it. Before, it shared a database with other services such as graphs, cruncher and autoland. Upgraded a production slave of Bugzilla to MySQL 5.1 and put it in the mix. It has been running for a few weeks without any problems, so I will be upgrading the other slaves as soon as I have time. Helped debug why https://reps.mozilla.org/people/ was slow. It was not a database issue. (I have to say, after doing 5 years of consulting, where problems are usually the database, it’s nice to work in a shop where the problem usually is NOT the database!) Started to turn an old mail server, which had lots of space, into a backup server in our Phoenix data center. Added some metadata for http://graphs.mozilla.org. Moved our PHPMyAdmin server. It moved, and nobody noticed, so either I did a good job, or nobody’s really using it! Added new database grants so our metrics team could access new customBugzilla fields. Moved our metrics databases. Moved the web development databases. Move our internal inventory database. This is tricky, because we relied heavily on our inventory database during our move, so we had to be extra careful that we did not cause any problems with the move…while we were moving this server! Of course that was not a huge issue, as we had a database in a third data center that took all the traffic and became the active database cluster as we moved the original cluster (from San Jose to Santa Clara). There was plenty of PostgreSQL work to do as well. I learned how to refresh our stage database from production and also got a lot more practice in reprocessing crash reports for our Crash Stats database. Did a test-run of moving the database behind a Mozilla wiki, and mentored the Web Operations team on how to do it. They did it successfully, which meant I did not have to be up in the wee hours of the morning! Got a sanitized copy of the Bugzilla database to some developers, a few times. Created the Mozilla Labs database cluster. Created the Bedrock database cluster – on Percona Server 5.5! I am very excited to be using a 5.5 version on new projects. One of the goals I am making slow progress on is upgrading servers from MySQL 5.0. Created the new developer database cluster. Created the Personas database cluster. Decommissioned old http://support.mozilla.com database servers. Decommissioned old http://addons.mozilla.com database servers. Decommissioned old Bugzilla staging database servers. Created a new Bugzilla staging database cluster. Debugged a Hive problem – For reference, the problem was getting this error: ERROR metadata.Hive: javax.jdo.JDOException: Couldnt obtain a new sequence (unique id) : Binary logging not possible. Message: Transaction level 'READ-COMMITTED' in InnoDB is not safe for binlog mode 'STATEMENT' The problem is that Hive automatically sets the transaction isolation level to be READ_COMMITTED by default. The solution is to change Hive’s configuration settings to set the transaction isolation level to REPEATABLE_READ, to match MySQL’s default. It has been a crazy few weeks, and will only get crazier as I will be traveling more and more!

  • Fun with Cluster and Locking
    I've been dealing with MySQL Cluster in one way or another since around 2005 or so (back in the MySQL 4.1 days) but it is still full of "funny" surprises. This post is a collection of different locking related issue i ran into during the previous weeks that i had not been aware of up to now (or simply may have forgotten over time) == Unique hash indexes lock exclusively == This is the one that regular users are most likely to run into: in general row logs in MySQL Cluster distinguish between reads and writes so that writers can block other writers, but not readers and readers from other transactions always see the last committed row value (Cluster currently only supports the READ COMMITTED isolation level). As soon as you have a secondary unique index in addition to a primary key things are different though. Internally a unique index that is not the primary key is implemented as a unique hash index in Cluster (and optionally also as an additional ordered T-Tree), and for unique hash entries there only seems to be exclusive locking so that writers will block readers across transactions. See http://bugs.mysql.com/65086 for actual examples. == Long running transactions can block starting nodes == Less likely to happen, but very annoying if it happens and you don't know about it: A starting node needs to lock all rows for a very short period of time near the end of start phase 5. At this point it needs to wait for all current transaction to free the locks they are holding (by either COMMIT or ROLLBACK). This is usually not a problem as cluster transaction are not supposed to last very long anyway, but if there happens to be a long running transaction it can potentially block a node start forever. A stopping node will wait for pending transactions for 5 seconds and will then terminate these transactions the hard way, a starting node on the other hand will gracefully wait forever. This is bad in two ways: 1) it is obviously bad availability wise as you may end up with a non-redundant configuration for extended periods of time and without any way to automatically identify and terminate the offending transaction(s) 2) currently the blocked starting node does not even tell what it is waiting for, it just silently sits and waits. So unless you know what is going on you are faced with a node that is simply stuck without doing anything (CPU, disc and network load next to zero) The related bug report is http://bugs.mysql.com/65037 , proposed solutions are meaningful log messages in the short term and killing active transactions after a grace period in the long run. == Weird error message on lock timeout == This last item is related to the non-standard INSERT ... ON DUPLICATE KEY UPDATE construct. What can happen here is that the INSERT part fails with a lock wait timeout as the key is write locked by another transaction. The expected error message in this case would simply be "Lock wait timeout exceed" but what you'll actually get is "Transaction already aborted". What seems to happen here is that the ON DUPLICATE KEY UPDATE part is triggered by any error in the INSERT phase and not only by duplicate key violation. So the INSERT part is tried tried, fails with "Lock wait timeout" (which implicitly rolls back the current transaction), next the UPDATE part is tried, fails with "Transaction already aborted" and overwrites the previous error ... The related bug report is http://bugs.mysql.com/65130

  • Some fun around history list
    Why this article? First of all because I was having fun in digging in the code. Then I was reading a lot about the improvements we will have in MySQL 5.6, and of some already present in 5.5. Most of them are well cover by people for sure more expert then me, so I read and read, but after a while I start to be also curious, and I start to read the code, and do tests. I start to do comparison between versions, like 5.1 - 5.5. - 5.6 One of the things I was looking to was how the new Purge thread mechanism works and his implications. I have to say that it seems working better then the previous versions, and the Dimitry blog (see reference) seems confirm that. So again why the article? Because I think there are some traps here and there and I feel the need to write about them. The worse behaviour is when using MySQL 5.5, and this is because in 5.5 we have an intermediate situation, where the purge is not fully rewrite as in 5.6, but also not bound to the main thread.   What is the history list? MySQL uses (from Innodb 1.1 MySQL 5.5) 1 to 128 Rollback segments, each able to support 1023 transactions. Each transaction is assigned to one of the rollback segments, and remains tied to that rollback segment for the duration. This enhancement improves both scalability (higher number of concurrent transactions) and performance (less contention when different transactions access the rollback segments). History list is tightly bound to undo log representing the number of Header Pages related to undo log segments, segments that are related to finalize transactions,commit or roll back. That's it, History list represent the number of not yet flush segments in the undo log.   Old and New Previously, the purge thread was directly controlled by the main thread in InnoDB causing serious side effects for description of which read ( http://dimitrik.free.fr/blog/archives/2010/04/mysql-performance-why-purge-thread-in-innodb.html). The main change was to move out the purge thread and allow it to run it isolated, such that it will not impact other process. The move was done in Innodb 1.1 (present in 5.1 and 5.5) But that is not all, the new version of purge has a mechanism that allow it to be more or less aggressive, in relation to the increase or decrease of the History List length. This option is enable in the 5.6 release and is taking the innodb_purge_threads as the "up to value" it can use to increase the number of threads for purging.   Different behaviour What is quite obvious is that the behaviour of the new Purge mechanism, is quite different from the previous one, ie 5.0 (innoDB 1.0) or 5.1. In the previous versions of InnoDB, we were educated to consider the History List something that should always be close to 0, not always but as soon as possible. Frankly speaking that was always a little bit confuse, but as said the MANUAL, was suggesting in several place to keep it between reduced numbers: I.e. (http://dev.mysql.com/doc/refman/5.5/en/innodb-multi-versioning.html) "If you insert and delete rows in smallish batches at about the same rate in the table, the purge thread can start to lag behind and the table can grow bigger and bigger because of all the “dead” rows, making everything disk-bound and very slow In such a case, throttle new row operations, and allocate more resources to the purge thread by tuning the innodb_max_purge_lag system variable."   A classic scenario for such issue, is an application taking track of the activity on the network, that require to write huge number of small insert into the database.   From my tests I have seen an incredible number of entry in the history list in 5.5, that were not present in the previous InnoDB version, and that are not present again in 5.6. The point is it could happen to have so many transactions, doing INSERT,UPDATE or DELETE that the History grows too much, and the un-flushed undo log as well. To prevent issues, we should tune the value of the innodb_max_purge_lag in order to allow InnoDB to complete the PURGE operations.   Innodb_max_purge_lag is the maximum number in history list we want to have, above which Innodb will start to apply an indiscriminate delay in pushing the operations. the formula is quiet simple: 1 ((purge_lag/innodb_max_purge_lag)×10)–5 milliseconds.   Or following the code we have:   float ratio = (float) trx_sys->rseg_history_len/ srv_max_purge_lag; ((ratio - .5) * 10000);       If we have a History list of 1200 and we have set innodb_max_purge_lag to 1000 result should be:   1 2 ((1200/1000)X10)-5= 7 ms delay for operation. following the manual ((1200/100) -5) * 10000 = 7000 <-- Microsecond following the code     All match and delay will be 7 ms.     Also the max limit in previous version for the delay was of 4295 seconds!! While in 5.5/5.6 we see a different way of managing the max number of seconds: 1 2 3 4 if (delay > srv_max_purge_lag_delay) { delay = srv_max_purge_lag_delay; }   Where srv_max_purge_lag_delay max value is 10 seconds. So the max delay, the worse case will be 10 seconds.   In the past as said we were use to see the History list going up and down (relatively) fast, so the force delay was playing his role efficiently. At the same time, we knew that all operations in the Main threads where slowed down, so the forced delay, was something we had to leave with, or worse things could happen, like the need to perform an emergency page flush from the buffer pool, to allow REDO to recover space.   But something has changed... ... in better obviously... but we must be careful.   Better because now the purge thread works independently, and that it could scale, pending undo flushes do not slow down the other operations. Also in 5.6, MySQL could be more or less aggressive in relation to the History list to purge.   Those operation remain something we should  monitor and tune, for two main reasons: - space taken by undo log segments is till an issue, and now that the number is increase, it could be even worse. - Setting a wrong value for innodb_max_purge_lag could kill our performance.     Let us start digging a bit. First of all History list and purge log thread are still very bad documented in the InnoDB Status Report. In 5.5 we can count on the History list information, number of transaction purge has being up to,  then the number of purged record up to, finally in 5.6 we have the generic state. Not too much here, better info like number or running threads, real segments used (1-128), number of Header Pages in the segments, and dimension (at least pages as Percona does) would be nice to have.   Undo log is currently stored inside the system tablespace, this means that IT WILL NOT BE POSSIBLE to shrink the size of the table space, once undo log have taken huge amount of space. That's it, the 80% or more of the size of a system table space is because the undo log, when using innodb_file_per_table. and this was already true when InnoDB was using a single set of segments (1023), now that it can scale up to 130944, and that it supports better more transactions, the size on disk can explode.   Some numbers to have better understanding, History list 359585 insert/s 35649.67 pages to flush in the undo log 429126 Means in the undo log a total size of ~ 6.7GB       Considering that normal block size is 4K in file system each page is 4 operation, we will have 1,716,504 operation, assuming that each page will be sequential, this means 3 ms for Seek+half rotation then, 1 ms transfer for the first operation then 2ms for the remaining will be 12ms for each page delete on disk.   Meaning 5149.512 seconds (85 minutes)  at 6.7 Mb/s given the partial random write, to flush the whole.   Obviously this number changes in respect of the available cache and available spindles. Also more threads more capacity in write, less time, so the option innodb_purge_threads is more then welcome.   Setting the right value for innodb_max_purge_lag In setting this value we must keep into consideration the following: - the value is the number of head pages representing an undo log relation to a running/executed transaction. - Purge can apply only to complete transaction - delay apply to all write operation inside InnoDB   Nowadays is not uncommon to have high number in history list in 5.5, in this case "just" 359,585 head pages, is then very important to correctly balance the delay point of start with the real load, like transactions/sec and the kind of operations are ongoing on the server.   To clarify, the relevance also in case of reads, not only of writes, let me cite: <snip> Running Very Long Transaction If you’re running very long transaction, be it even SELECT, Innodb will be unable to purge records for changes which are done after this transaction has started, in default REPEATABLE-READ isolation mode. This means very long transactions are very bad causing a lot of garbage to be accommodated in the database. It is not limited to undo slots. When we’re speaking about Long Transactions the time is a bad measure. Having transaction in read only database open for weeks does no harm, however if database has very high update rate, say 10K+ rows are modified every second even 5 minute transaction may be considered long as it will be enough to accumulate about 3 million of row changes. <snip> (Peter Z)   But what can really harm your the system is the dealy define by the purge lag to improve the flushing. Assume we define it to innodb_max_purge_lag=200,000, and we do have the number of pending flush as for the above 359585. Doing calculation as for the previous formula   ((359585/200000)X10)-5= 12.97925 ms delay for operation       Hey that's not too bad, I will delay only 12.97925 ms to operation/Insert to help the purge. But what is not clear is what an operation is for the delay, or more correctly where do the delay really apply?   Ready? Are you sit comfortable? 2) row0mysql.c   1 2 3 4 5 6 7 8 9 10 11 12 /*******************************************************************//** Delays an INSERT, DELETE or UPDATE operation if the purge is lagging. */ static void row_mysql_delay_if_needed(void) /*===========================*/ { if (srv_dml_needed_delay) { os_thread_sleep(srv_dml_needed_delay); } }       3)os0thread.c   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 /*****************************************************************//** The thread sleeps at least the time given in microseconds. */ UNIV_INTERN void os_thread_sleep( /*============*/ ulint tm) /*!< in: time in microseconds */ { #ifdef __WIN__ Sleep((DWORD) tm / 1000); #else struct timeval t; t.tv_sec = tm / 1000000; t.tv_usec = tm % 1000000; select(0, NULL, NULL, NULL, &t); #endif }     Do you get it? delay is per ROW.   So assume you have a system checking connections status and traffic, collecting statistics every minute for your 100000 connected users, each usert generate at least 8 insert, plus a consolidation operation to get average 10 minutes each insert. Each insert per user taking, 0.002 second. All traffic manage by 300 threads.     100000 x 8 = 800000/300 = 2666 insert for transactions x 0.002ms each = 5.3sec to run all the operations.     Now what if we apply the delay given we have History list quite above as say before?   I have to sum the 12ms to the 0.002 which will give us 0.122 sec, which means 325 seconds (5.4 minutes) for each thread!!!! Do not forget the consolidation process, who needs to run each 10 minutes. So it has to process (1 reads for user per minute x 100000 users) x 10 minutes, split in 10 threads doing it by user id set, assuming each read per user will take 0.001 ms (given already in memory) and 0.002 ms for write. Without delay we will have = 1,000,000 / 10 = 100,000 x 0.003 = 300sec (5 minutes) for thread. With delay it will be for each operation 0.122 = 12200 seconds (203 minutes).   Last but not least the consolidation will collide with the inserts, causing possible increase of the delay because the REPEATBLE_READ, and another possible issue is... the consolidation will cause pages to remain in a dirty state for too long, possibly causing serious issue in the REDO log in case of need to free space.   I did push some number a little bit but not too much and only to make the scenario more clear.   Now just to remove some possible doubt:   Are you sure that it really push it by ROW? let us create a working test.     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 CREATE TABLE `City_test2` ( `ID` int(11) NOT NULL DEFAULT '0', `Name` char(35) NOT NULL DEFAULT '', `CountryCode` char(3) NOT NULL DEFAULT '', `District` char(20) NOT NULL DEFAULT '', `Population` int(11) NOT NULL DEFAULT '0', `satisfaction_grade` char(50) DEFAULT NULL, `previous_satgrade` varchar(50) DEFAULT NULL, KEY `satisfaction_grade` (`satisfaction_grade`,`previous_satgrade`) ) ENGINE=InnoDB DEFAULT CHARSET=latin1   insert into City_test2 select * from city; mysql> truncate table City_test2;insert into City_test2 select * from city limit 1000; Query OK, 0 rows affected (0.24 sec) Query OK, 1000 rows affected (0.10 sec) Records: 1000 Duplicates: 0 Warnings: 0       So running the query we see the History going up and down but never to 0 as before. but changing the innodb_max_purge_lag and setting it to a lower value then the history say history is 137, set the lag to 100 in theory there should be the following delay as for instructions: ((purge_lag/innodb_max_purge_lag)×10)–5 milliseconds. result should be   ((137/100)X10)-5=8.7 ms         1 2 3 4 5 6 7 8 mysql> set global innodb_max_purge_lag=100; Query OK, 0 rows affected (0.00 sec)   mysql> truncate table City_test2;insert into City_test2 select * from city limit 1000; Query OK, 0 rows affected (0.10 sec) Query OK, 1000 rows affected (7.40 sec) Records: 1000 Duplicates: 0 Warnings: 0       Close enough from my point of view, and it increase at the increasing of the number of rows. Changing : innodb_purge_batch_size innodb_rollback_segments innodb_purge_threads   Will change how the purge will work but not the delay, innodb_max_purge_lag is the only possible parameter to use, and is not enough.   Conclusions The new mechanism for the purge is much more flexible, and capable to scale. The fact that is now separate from the main thread reduce a lot the negative effects. Never the less the capacity that the undo log purge have now, has also possible risks, huge pending flushes means possible huge space on disk and/or huge delay. The delay behaviour is not new and it was already present MySQL 5.0, but the new capacity bring it to a higher level of risk, specially in 5.5. In 5.6 the purging thread is much more efficient and I was having really hard time to do get huge history list, but when I get it I had same behaviour. Whenever we need to tune the purge lag settings, the value needs to be set not in relation to the history list, but in relation to the maximum acceptable delay in the inserts as cumulative value.   References http://www.mysqlperformanceblog.com/2011/01/12/innodb-undo-segment-siz-and-transaction-isolation/ http://www.mysqlperformanceblog.com/2010/06/10/reasons-for-run-away-main-innodb-tablespace/ http://dimitrik.free.fr/blog/archives/2010/04/mysql-performance-why-purge-thread-in-innodb.html http://www.mysqlperformanceblog.com/2011/06/09/aligning-io-on-a-hard-disk-raid-the-benchmarks/ http://blogs.innodb.com/wp/2011/04/mysql-5-6-multi-threaded-purge/ http://mysqldump.azundris.com/feeds/categories/1-Strangeness.rss   {joscommentenable}

  • PECL/mysqlnd_ms 1.4 = charset pitfalls solved
    Tweaking is the motto - what an easy release PECL/mysqlnd_ms 1.4 will be! The first tweak for the next stable version of the mysqlnd replication and load balancing plugin solves pitfalls around charsets. String escaping now works on lazy connection handles (default) prior to establishing a connection to MySQL. A new server_charset setting has been introduced for this. The way it works also prevents you from the risk of using a different charset for escaping than used later on for your connection. Lazy connections and server_charset PECL/mysqlnd_ms is a load balancer. A users connection handle can point to different nodes of a replication cluster over time. For example, if using MySQL Replication, the connection handle may point to the master for running writes and, later on, to one of the slaves for reads. At the very moment a user opens a connection handle, the load balancer does not yet know which cluster node needs to be queried first. /* Load balanced following "myapp" section rules from the plugins config file */ $mysqli = new mysqli("myapp", "username", "password", "database"); $pdo = new PDO('mysql:host=myapp;dbname=database', 'username', 'password'); $mysql = mysql_connect("myapp", "username", "password"); Thus, the plugin delays opening a MySQL connection to any configured cluster node (master or slave) until a node has been selected for statement execution. The plugin calls this lazy connections. Instead of openening a connection to a default node or even opening connections to all nodes, it waits and sees. /* Nothing but a handle, no MySQL connection opened so far*/ $mysqli = new mysqli("myapp", "username", "password", "database"); /* Escaping on a lazy connection is not possible - will emit a warning */ $mysqli->real_escape("What charset to use?"); /* A node is picked and a connection will be openend */ $mysqli->query("SELECT 'connection will be opened now' AS _msg FROM DUAL"); While lazy connections potentially help to keep the number of connections opened as low as possible, there is a problem. Which charset to use for string escaping prior to opening a connection to MySQL? PECL/mysqlnd_ms 1.4 will search for a server_charset setting in its configuration file and use it. If it is not there, it will bark at you, pretty much like all previous stable releases. A warning will be thrown that reads like (mysqlnd_ms) string escaping doesn't work without established connection in the current series and is a bit more verbose in the 1.4 series, like (mysqlnd_ms) string escaping doesn't work without established connection. Possible solution is to add server_charset to your configuration Setting server_charset removes the warning. PECL/mysqlnd_ms 1.4 will use the server_charset to do the string escaping. At any time thereafter the user can change the charset using an API call for string escaping with a different charset. SQL statements shall not be used to change charsets as they are not monitored by the plugin. However, upon establishing a connection to any MySQL server, the connection will be set to server_charset again. No need to take care of the server configuration Enforcing the configured server_charset whenever a connection is opened free’s the administrator from the need to set the same default charset on all servers. When using version 1.3 or older you should make sure that servers use the same charset. This prevents tapping into the pitfall of escaping a string using the charset of the first server contacted, then switching the connection to another server/connection with a different charset and accidently using a wrongly escaped string. More tweaks and improvements considered for version 1.4 are listed at http://wiki.php.net/pecl/mysqlnd_ms. Please, do not understand the list as a firm promise that everything will be implemented. Feel free to add ideas or tasks to the wiki page. Happy hacking! @Ulf_Wendel 

  • OurSQL Episode 88: MySQL Founders
    This week we listen to MySQL founders David Axmark and Michael "Monty" Widenius talk about all sorts of topics, from the MySQL ecosystem to "Crazy Monty", an open source restaurant. David and Monty don't always agree, so this panel is enlightening and also funny. News/Events/Feedback MariaDB 5.5 features - 5.5 is now GA! IOUG podcast about MySQL Connect and the new Development Milestone Release (DMR) of MySQL 5.6. read more

  • MySQL Cluster: Disk Data Config
    If you are using the Disk Data feature of MySQL Cluster you may find this useful. As you may know the default way of storing data in MySQL Cluster is in-memory. However, if you have tables that don't require realtime access and/or are growing larger than your available memory then you can use disk data tables. I have written a bunch of posts about disk data tables before:http://johanandersson.blogspot.se/2008/06/disk-data-tables.htmlhttp://johanandersson.blogspot.se/2008/11/disk-data-extent-deallocation-caveats.htmlhttp://johanandersson.blogspot.se/2008/12/disk-data-summary.htmlhttp://johanandersson.blogspot.se/2010/11/diskpagebuffermemory-tuning-and-disk.htmlHowever, it is quite hard to configure Disk Data tables correctly. There are some parameters to keep an eye on:UNDO LOG - MySQL Cluster undo logs every write access to the disk data tables. You can grow this resource online (ALTER .. ADD ..)REDO LOG - as for in-memory tables  every write is REDO logged. You can grow the REDO log with a  Rolling restart (NoOfFragmentLogFiles). The REDO LOG is shared between the disk data tables and the in-memory tables.REDO BUFFER - specified as RedoBuffer=32M or higher in config.ini.  The REDO BUFFER is shared between the disk data tables and the in-memory tables.UNDO BUFFER -  created from SharedGlobalMemory resource and is set at LOGFILE GROUP creation. WARNING!! Can not be changed at runtime! The LOGFILE GROUP must be recreated, and it requires that the table space (and all its data files) is dropped before.TABLESPACE - can be extended at runtime.  ALTER .. ADD..DiskPageBufferMemory - like the innodb_buffer_pool. The bigger the better. Can be tuned with a Rolling Restart.For stability the most common mistakes are to set UNDO LOG and UNDO BUFFER too small.What can happen then?Stability and RecoveryIf UNDO BUFFER is too small symptoms are:Failed writes to the Disk Data tables. In worst case you can hit crashing bugs.If UNDO LOG is too small symptoms are:Failed writes to the Disk Data tables. In worst case you can hit crashing bugs. Node recovery may not be possible - the recovering node will crash. Severalnines has assisted a number of client to perform  a number of recovery when this has been the case.If there is a Cluster crash due to a too small UNDO LOG you will most likely have to restore from backup.How can it be possible for MySQL Cluster to fail to recover the node, when it was possible to load it with data?The problem is that your write pattern when writing data at runtime into MySQL Cluster is different from the write pattern at recovery.At recovery the recovering data node will copy data (over the network) from the other node in the node group. It will do this as fast as it can, table by table.Thus, it will be a lot of data coming into the recovering data node (like a bulk write) and in this case then the UNDO LOG may be too small to handle the write log (you can argue why UNDO and REDO log while doing recovery but but..). If the UNDO LOG is over run, then the node will crash.ConfigurationAs usual this depends a lot  on the access patterns you have so it is hard to give exact numbers but if you rely on disk data tables you should have:[NDBD DEFAULT] SharedGlobalMemory=1024MDisk operation records and other meta data is allocated from this buffer. The more records you have in the disk data tables the bigger this buffer must be. 1024M is a good start.[NDBD DEFAULT] DiskPageBufferMemory=4096MAs large as you can (remember that in-memory tables, and for disk data tables, indexed attributes resides in DataMemory so you need to have enough DataMemory.The important when tuning the DiskPageBufferMemory is to get a good hit ratio. If you use ClusterControl you will see the hit ratio (picture coming soon).4096M is a good start.UNDO_BUFFER_SIZE=32M  or 64M (same size as you have for the RedoBuffer)This is specified in the CREATE LOGFILE GROUP ... UNDO LOG space should be 0.5 x NoOfFragmentLogFileSize x 4 x FragmentLogFileSize (this may take it to the extreme but better safe than sorry).[NDBD DEFAULT]  DiskIoThreadPool=8    (2 is default)   UndoIndexBuffer/UndoDataBuffer  - those parameters are deprecated but still in the manual (I have grepped the 7.2.4 source and find no trace of it. It was a long time those were used. When LCP was done different in Cluster. So no need to set these.Please not that you can always add more UNDO LOG files online by issuing ALTER LOGFILE GROUP  ADD UNDOFILE 'undo_X.dat INITIAL_SIZE=1024M ENGINE=NDBCLUSTER;However, if the Cluster crashes, then it is too late.Disk LayoutPut the Redo log + LCP  on separate disk spindles ( A)Put the Undo log on separate disk spindles  (FileSystemPathUndoFiles) (B)Put the Data files (table space) on separate disk spindles (FileSystemPathDataFiles) (C)In many cases you can combine B and C on the same disks, and if you have a good disk array, you can of course combine A,B and C. Bottom line is, watch your disks and utilized they are. That will determine if you have to make disk reconfigurations. Also, beware of trial and error before getting the DiskPageBufferMemory and SharedGlobalMemory right.

  • Different type of threads used by MySQL
    MySQL uses different type of threads for running specific utilities like mysql_install_db, flushes MyISAM tables, Replication, TCP/IP sockets etc. All threads can run with four different priorities like Interrupt, connect, wait and query priorities. Different types of threads are: The … Continue reading → No related posts.

  • Log Buffer #269, A Carnival of the Vanities for DBAs
    There is no replacement for the documentation of the database products. There is no alternative of forums for these technologies, and also there is no alter ego for the database blogs. They add up yet another avenue to explore when the professionals need help. This Log Buffer Edition helps professional in identifying few blog posts [...]

  • Percona Server 5.1.62-13.3 released!
    Percona is glad to announce the release of Percona Server 5.1.62-13.3 on April 25, 2012 (Downloads are available here and from the Percona Software Repositories). Based on MySQL 5.1.62, including all the bug fixes in it, Percona Server 5.1.62-13.3 is now the current stable release in the 5.1 series. All of Percona‘s software is open-source and free, all the details of the release can be found in the 5.1.62-13.3 milestone at Launchpad. New option rewrite-db has been added to the mysqlbinlog utility that allows the changing names of the used databases in both Row-Based and Statement-Based replication. This was possible before by using tools like grep, awk and sed but only for SBR, because with RBR database name is encoded within the BINLOG ‘….’ statement. The full release notes can be found in our online documentation.

  • Percona Server 5.5.22-25.2 released!
    Percona is glad to announce the release of Percona Server 5.5.22-25.2 on April 25, 2012 (Downloads are available here and from the Percona Software Repositories). Based on MySQL 5.5.22, including all the bug fixes in it, Percona Server 5.5.22-25.2 is now the current stable release in the 5.5 series. All of Percona‘s software is open-source and free, all the details of the release can be found in the 5.5.22-25.1 milestone at Launchpad. Bugs Fixed: While running the test case found that MEMORY engine may return rows in non-deterministic order for equal keys. Bug fixed #892951 (Laurynas Biveinis). Full release notes are available in the documentation.

  • Sizing EC2 Servers to get more Bang for the Buck
    To launch an app there was a tried and true process. The process in the past was to rent a server in a managed hosting facility. Get a few more as you grow, then build a model showing that it would be cheaper to get your own cage and finance servers. If the app continues to grow, build a model showing it would cost cheaper and allow for more rapid growth if you built your own datacenter after multiple requests for more power, space and moving to larger spaces with more power.  Now we have a new Step. If Step 1 is managed hosting then Step 0 is putting your application in the cloud i.e. Amazon's EC2.When an app grows cost grows at a multiple in EC2. To offset these costs, take periodic looks at your architecture to see if there is anything you can do to reduce overall cost. Let's look at a tier that is easy to scale. Let's look at application memory caching on Memcache as an example to illustrate reducing cost.To scale network caches, add more memory. Make sure that the throughput of data going through the system doesn't exceed your instance network bandwidth capacity and generally, you are good. (There are CPU concerns at high concurrency and pulling many bits at a huge rate). Simplifying things the cost to scale Memcache is really the cost per GB of memory in EC2.Sizing your Memcache pool is based on a number of factors. What is being cached? How is the cache used? How many distinct items and how large is each item on average. For a dynamic app that caches database rows let’s base the cache on reducing repetitive queries to the database for users hitting the site. So the size should be based on how many daily active users the application has. The model is a function of daily active users (DAU) cacheable data size.  If the database is 200GB from 1 million users and 10K users use the site per day, cache 10K users worth of data instead of 200GB.  Northern California DC prices us-west1 - has the best uptime and costs the most:m1.large Amazon's Large Instance Server has 7.5 GB of memory and costs $0.36 per hour per server. There are 744 hours in a 31-day month so this m1.large costs roughly $268 per month.m1.xlarge is 15GB but has eight Compute Cores - you don't need eight so lets skip that and look at the High memory plan.m2.xlarge has 17.1 GB of memory and 6.5 Compute Cores and costs roughly $376 per month.m2.2xlarge has  34.2 GB of memory and costs  $752m2.4xlarge has  68.4 GB of memory and costs $1505So the impulse might be lets get a bunch of m1.large since they are the cheapest at $268, but in actuality the price is not the cheapest for the scale needs.m1.large costs roughly 5 cents per hour per GB while the high memory family costs 3 cents per hour per GB. Since the other scale condition is to not saturate your network bandwidth, having more points of access solves this with the additional benefit of loosing smaller sections of cache when a server goes down. (This is handwaving assuming the instances are not on the same server/rackswitch etc).Therefore, the best bang for the buck is m2.xlarge (m2 family) at 3 cents per GB per hour for a monthly cost of $376. This is only 40% more then m1.large cost per month for 2.26 times the memory of m1.large. Now if the application doesn't require 2.26 times the memory and its not being utilized then its not worth spending more money.

  • SELF 2012 Preliminary Schedule
    The SELF 2012 Preliminary Schedule is now available. Just 43 days left until SELF and it looks like a great MySQL Lineup! A wide array of presenters from the MySQL community will be present, including Oracle Engineers. On top of that.... You can get in for FREE if you wanted too! Register here Join us and support Linux, MySQL and open source technologies over all.SELECT COUNT(sessions) FROM self_mysql_speakers WHERE year=2012;20+ rows

  • Amateurs – They give us professionals a bad name
    Any person with half a brain would see from the error messages below that the MySQL server is not operating optimally, or more specifically the MySQL upgrade has not completely successfully and let users can go happily use the website. It amazing me when web hosting providers tell their paying client that an upgrade has been performed yet they did not have the intelligence to actually look at the error log for confirmation. Got a mysql> prompt, it’s all good. One of the first things I check is the error log. When will people learn the MySQL error log is a valuable resource both for what it contains, and what it should not contain. 120426 17:36:00 [Note] /usr/libexec/mysqld: Shutdown complete 120426 17:36:00 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended 120426 17:36:00 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql 120426 17:36:00 [Note] Plugin 'FEDERATED' is disabled. /usr/libexec/mysqld: Table 'mysql.plugin' doesn't exist 120426 17:36:00 [ERROR] Can't open the mysql.plugin table. Please run mysql_upgrade to create it. 120426 17:36:00 InnoDB: The InnoDB memory heap is disabled 120426 17:36:00 InnoDB: Mutexes and rw_locks use GCC atomic builtins 120426 17:36:00 InnoDB: Compressed tables use zlib 1.2.3 120426 17:36:00 InnoDB: Using Linux native AIO 120426 17:36:00 InnoDB: Initializing buffer pool, size = 128.0M 120426 17:36:00 InnoDB: Completed initialization of buffer pool 120426 17:36:00 InnoDB: highest supported file format is Barracuda. 120426 17:36:00 InnoDB: Waiting for the background threads to start 120426 17:36:01 InnoDB: 1.1.8 started; log sequence number 232577699 120426 17:36:01 [ERROR] Missing system table mysql.proxies_priv; please run mysql_upgrade to create it 120426 17:36:01 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist 120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_current' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_history' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_history_long' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_consumers' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_instruments' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'setup_timers' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'performance_timers' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'threads' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_thread_by_event_name' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_by_instance' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'events_waits_summary_global_by_event_name' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_summary_by_event_name' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_summary_by_instance' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'mutex_instances' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'rwlock_instances' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'cond_instances' has the wrong structure 120426 17:36:01 [ERROR] Native table 'performance_schema'.'file_instances' has the wrong structure 120426 17:36:01 [ERROR] Column count of mysql.db is wrong. Expected 22, found 20. Created with MySQL 50077, now running 50523. Please use mysql_upgrade to fix this error. 120426 17:36:01 [ERROR] mysql.user has no `Event_priv` column at position 29 120426 17:36:01 [ERROR] Cannot open mysql.event 120426 17:36:01 [ERROR] Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler. 120426 17:36:01 [Note] /usr/libexec/mysqld: ready for connections. Version: '5.5.23-cll' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL) by Atomicorp 120426 17:46:01 [ERROR] Missing system table mysql.proxies_priv; please run mysql_upgrade to create it 120426 17:46:01 [ERROR] Can't open and lock privilege tables: Table 'mysql.servers' doesn't exist 120426 17:46:01 [ERROR] Column count of mysql.proc is wrong. Expected 20, found 16. Created with MySQL 50077, now running 50523. Please use mysql_upgrade to fix this error. Some more light reading at Have you checked your MySQL error log today? and Monitoring MySQL – The error log

  • Books vs. e-Books for DBA's
    As most people still do I learned to read using books. WhooHoo!Books are nice. Besides reading them they are also a nice decoration on your shelf. There is a brilliant TED talk by Chip Kidd on this subject.But sometimes books have drawbacks. This is where I have to start the comparison with vinyl records (Yes, you're still reading a database oriented blog). Vinyl records look nice and are still being sold and yes I also still use them. The drawback is that car dealers start to look puzzeled if you ask them if your new multimedia system in your car is able to play your old Led Zeppelin records. The market for portable record players is small, and that's for a good reason.The problem with books about databases is that they get old very soon. The MySQL 5.1 Cluster Certification Study Guide was printed by lulu.com which made it possible to quickly update the material. This made sure that the material wasn't outdated when you bought it.I like to use books as reference material, but I tend to use Google more often and the books stay on the bookshelf and are getting old and dusty. One of the reasons for this is that taking books with me just for reference is not an option judging by the weight of it.At Percona Live UK I got a voucher from O'reilly to get a free e-Book. So I chose 'SQL and Relational Theory'. I started to read it on my laptop with FBreader and on my iPhone using Stanza. Both my phone and laptop are not really made for reading. So I bought an Sony Reader, which is made for reading.Reading 'SQL and Relational Theory' on the Sony Reader is nice. The only annoyance is that the examples are like this:SELECT COUNT(*) | SELECT COUNT(col1)FROM tbl1       | FROM tbl1And with line wrapping it looks like this:SELECT COUNT(*) | SELECT COUNT(col1)FROM tbl1       | FROM tbl1Which is not very readable.The book is very theoretical as you might expect, but nonetheless it's a very good read.The Sony Reader is not very suitable for reading whitepapers in PDF format  as most whitepapers are in A4 or Letter format which is too big for the device. Of course software like Calibre can covert some of those.(Oracle, Percona, others… please also publish your whitepapers in a format more suitable for an eReader)The device itself is very nice. The battery time and e-Ink display are good (especially if you compare them with an tablet).Unfortunately it doesn't increase my reading speed and it doesn't give me more time to read.I'm looking forward to read some other database books in e-Book format.  I think the next one on my list is High Performance MySQL.I planned to publish this post when I finished reading SQL and Relation Theory, but I thought now might be a better time as O'Reilly has a discount for that book and other books by C.J. Date.The Sony Reader runs a modified Android (yes it's possible to root it to play angry birds on it). It also has a webbrowser, but itsn't well suited for reading Planet MySQL or Planet MariaDB. Using the webbrowser to download the MP3 for the OurSQL Podcast and then playing it does work flawlessly. I tried to download the EPUB file for the MySQL Reference Manual, but that failed so I used USB for that.

  • Interest building in Percona XtraDB Cluster
    In the last few weeks I’ve been caught off guard by the number of people who’ve told me they have been evaluating Percona XtraDB Cluster (link), and even more surprised at the projects they’re considering it for. Yesterday alone I spoke to several people who have been evaluating it for large, mission-critical enterprise deployments. Some new, some to replace existing systems that use standard MySQL replication. What was interesting is that some people said they’ve been putting it through its paces for months — before we even released it as GA. Another person said he was evaluating it and tried a bunch of things like killing nodes, and it “just worked.” He sounded like he’d been suspicious: had it REALLY worked? But then, on further investigation, he was able to confirm that yes, it had just worked. The node went away; the cluster as a whole was healthy and happy. It’s funny how you can get a feeling about the momentum on a product or idea or event. It probably builds upon the little things, like tone of voice or expressions on peoples’ faces. In any case I have this gut feeling about Percona XtraDB Cluster: it’s about to happen. Further Reading:Building a MySQL server with XtraDB for speed Sessions of interest at the Percona Performance Conference What’s different about XtraDB? Xtrabackup is for InnoDB tables too, not just XtraDB Book Review: Building powerful and robust websites with Drupal 6

  • Collaborate 2012 – Day 4
    Last day of Collaborate 2012 and Scott Spendolini, Sumneva, gave a great presentation on APEX. Only caught the beginning Jan Visser’s Perl presentation because of the distance to the Luxor from the Mandalay South Conference Center and anticipated queuing time for checkout. We can now look forward to Collaborate 2013 in Denver, Colorado. Back to observing and working with code, here’s a nice article from MacWorld on how you set up a WebDAV on the Mac. While I’m mentioning Mac OS X and development, there’s still no firm upgrade window for the missing text editing tool – TextMate, and WWDC 2012 tickets sold out in two hours.

  • Come present at MySQL Connect
    MySQL Connect will be held this September before Oracle Open World and time is running out for you to submit your presentation. There are tracks on Application Development, Architecture and Design, Cloud Computing, Database Administration, High Availability, and Performance & Scaling. You’ll need between 15-40 slides for an hour presentation (sixty minute session, leave ten or so minutes for Q&A). The submission software lets you stop and save before submission, just in case you need to do some research while create your submission. And once you submit, you can go back and edit so that you can include the latest and greatest information. But May 6th is the last day submissions will be accepted. Good luck and I hope to see you in San Fransisco next September.

  • My First Collaborate
    This has been an interesting week. I have been told over the past year that the Collaborate conference was more Oracle focused and little interest in MySQL. While the conference is Oracle focused, MySQL and open source technologies do have people's interest. I was informed that the MySQL sessions all had larger attendees than last year overall. I did meet some great people here and a big thank you to the IOUG, OAUG, QUEST for this event. Some typical comments I heard from attendees about MySQL while at Collaborate: “SQL Server just does not have the throughput, I know MySQL can do it.”“Depending on what your application is, MySQL is just better and faster.”The biggest hurdle for MySQL this week was how to pronounce it: “MySQL” is “My Ess Que Ell” (not “my sequel”), otherwise MySQL was well received. Collaborate will be in Denver next year so I am surely to attend. I look forward to presenting how impressive MySQL 5.6 is and having more MySQL sessions or a track. Below are some links for Oracle DBAs looking at MySQL, I hope you find them helpful. 50 things to know before migrating Oracle to MySQL http://www.xaprb.com/blog/2009/03/13/50-things-to-know-before-migrating-oracle-to-mysql/http://www.xzilla.net/blog/2009/Apr/Taking-a-stab-at-50-things-to-know-before-migrating-MySQL-to-Oracle.htmlhttp://scale-out-blog.blogspot.com/2009/04/overcoming-mysql-to-oracle-culture.html MySQL for Oracle DBAs, Oracle Developers: http://mysqluc.com/presentations/mysql06/bradford_ronald.pdfhttp://www.markleith.co.uk/dl/MySQL_for_Oracle_DBAs.pdfhttp://mysql-dba-journey.blogspot.com/2009/07/differences-between-oracle-and-mysql.html Pythian: http://www.pythian.com/news/17895/mysql-troubleshooting-for-the-obsessive-oracle-dba/http://www.pythian.com/news/13369/notes-on-learning-mysql-as-an-oracle-dba/ MySQL for Oracle DBAs (I, II): http://marist89.blogspot.com/2005/09/mysql-to-oracle-dba-part-i.htmlhttp://marist89.blogspot.com/2005/09/mysql-to-oracle-dba-part-ii.html MySQL: An Introduction for Oracle DBAs http://patrickhurley.wordpress.com/mysql-for-oracle-dba/

  • Google fighting against regular content as well?
    Whatever Google intended to do with the latest update, I personally think there was some collateral damage done to sites like mysqljoin.com . Since this is only a site I created to show people how joins can be used within MySQL (for free), I never spent money to a SEO agency or something. All I did was installing a SEO Plugin for WordPress. Am I guilty of “over optimizing” now? The site is approx. 1,5 years old and it grew slowly. The first few month nothing really happened, then people somehow started to link to us , people on stackoverflow.com started to reference us and things like that. To me, this looks really organic. However, the traffic dropped overnight from approx. 1500 unique visitors per day to approx. 300 unique visitors. If this update works to fight again spammy sites in the organic results, I’m fine with it. Even though I have no idea why Google penalized us (no message in webmaster tools or the like), it looks like it makes some form of “reverse SEO” possible: Instead of focussing on improving your own search engine ranking that much to outperform the competition, simply attack the competition by whatever Google now obviously doesn’t tolerate any more. Honestly, do you as a reader thing the MySQL Join Tutorials are spam? Is it “valuable content” as it’s called by Google? A lot of people who commented on the posts seem to be happy about them. We’re open to new ideas! What would you love to see as a tutorial as well? What don’t you like about our current tutorials. Any feedback is more than welcome. Thank you very much, Jan

  • With InnoDB’s Transportable Tablespaces, Recovering Data from Stranded .ibd Files is a Thing of the Past
    Being a data recovery specialist and having recovered countless GBs of corrupted, and/or stranded, InnoDB data in my days, I am very happy to hear about the new InnoDB Transportable Tablespaces coming in MySQL 5.6! Back in the day, if you had a stranded .ibd file (the individual InnoDB data file with –innodb-file-per-table option), you basically had nothing (even though that file contained all of the data). This was because unless you had the original instance that that particular .ibd file (table) originated from, there was no way to load it, import, or dump from it. So it was not of much use, though all the data was *right* there. Thus I created the method of Recovering an InnoDB table from only an .ibd file (I should note that this was before the InnoDB Recovery Tool had been released, which can also be used to recover data from a stranded .ibd file too). However, if you’ve used either my method or the InnoDB Recovery Tool for such a job, it can be a bit of work to get the data dumped. For those experienced, it goes much faster. But still, you cannot get any faster than just being able to (roughly) import the individual tablespace right into any running MySQL 5.6 instance. Nice work! Note: Again, I must mention this is only in MySQL 5.6, so if you have a stranded .ibd file you need to recover data from pre-5.6, you’ll either need to use my method or the InnoDB Recovery Tool.  

  • Bulletproofing MySQL replication with checksums
    Read the original article at Bulletproofing MySQL replication with checksumsYour MySQL replications running well? You might not even know if they aren’t. One of the scariest things about MySQL replication is that it can drift out of sync with the master “silently”. No errors, no warnings.What and Why?MySQL’s replication solution evolved as a statement based technology. Instead of sending actual block changes, MySQL just has to log committed transactions, and reapply those on the slave side. This affords a wonderful array of topologies and different uses, but has it’s drawbacks. The biggest occur when data does not get updated or changed in the same way on the slave. If you’re new to MySQL or coming from the Oracle world you might expect that this would flag an error. But there are many scenarios in which MySQL will not flag an error:mixed transactional and non-transactional tablesuse of non-deterministic functions such as uuid()stored procedures and functionsupdate with LIMIT clauseThere are others but suffice it to say if you want to rely on your slave being consistent, you need to check it!The solution – mathematical checksumsIf you’re a seasoned Linux user, you’re probably familiar with the md5sum command. It creates a checksum on a file. You can do so on different servers to compare a file in a mathematically exact way. In fact rsync uses this technique to efficiently determine what files or pieces of files need to be copied across a network. That’s what makes it so fast!It turns out that MySQL can checksum tables too. However were we to build our own solution, we might have trouble doing so manually as table data is constantly in a state of flux.Enter Percona’s pt-table-checksum tool formerly part of Maatkit. Run it periodically against your master schemas or the entire instance if you like. It will store checksums of all of your tables in a special checksum table. The data from this table then will propagate through replication to all of your connected slaves.The tool then has a check mode, which allows you to verify all the connected slaves are ok, or report the differences if it finds any.Step-by-step SetupFirst you’ll need to grab a copy of the percona toolkit. Note that if you previously installed maatkit then you may want to delete those old scripts to avoid confusion. mk-table-checksum if you used maatkit, or pt-table-checksum if you have 1.0 versions. You likely installed using wget or perl Makefile, so you may need to go and remove those manually.Assuming you’ve already got the percona repository installed issue: $ yum install -y percona-toolkitI’ve found some of the maatkit tools to be rather fussy about getting all the options right. The first thing to do which will help simplify this is to add a section in your local user’s “.my.cnf” file like this:[client]user=rootpassword=myrootpassword That way the percona tools will look for this whenever it needs authentication credentials. Otherwise we assume localhost for this example, so you should verify you can connect with the mysql client as root from localhost.Now let's checksum the "mysql" system schema. $ pt-table-checksum --replicate=test.checksum --create-replicate-table --databases=mysql localhostNote the --create-replicate-table option. You only need this option the first time. From there the test.checksum table will exist.You should see some output that looks like this: TS ERRORS DIFFS ROWS CHUNKS SKIPPED TIME TABLE04-24T16:06:45 0 0 0 1 0 0.099 mysql.columns_priv04-24T16:06:45 0 0 32 1 0 0.100 mysql.db04-24T16:06:45 0 0 0 1 0 0.096 mysql.event04-24T16:06:45 0 0 0 1 0 0.096 mysql.func04-24T16:06:45 0 0 38 1 0 0.102 mysql.help_category04-24T16:06:45 0 0 452 1 0 0.106 mysql.help_keyword04-24T16:06:46 0 0 993 1 0 0.096 mysql.help_relation04-24T16:06:46 0 0 506 1 0 0.100 mysql.help_topic04-24T16:06:46 0 0 0 1 0 0.099 mysql.host04-24T16:06:46 0 0 0 1 0 0.104 mysql.ndb_binlog_index04-24T16:06:46 0 0 0 1 0 0.107 mysql.plugin04-24T16:06:46 0 1 1 1 0 0.115 mysql.proc04-24T16:06:46 0 0 0 1 0 0.186 mysql.procs_priv04-24T16:06:46 0 1 1 1 0 0.097 mysql.proxies_priv04-24T16:06:47 0 0 0 1 0 0.097 mysql.servers04-24T16:06:47 0 0 0 1 0 0.096 mysql.tables_priv04-24T16:06:47 0 0 0 1 0 0.098 mysql.time_zone04-24T16:06:47 0 0 0 1 0 0.097 mysql.time_zone_leap_second04-24T16:06:47 0 0 0 1 0 0.100 mysql.time_zone_name04-24T16:06:47 0 0 0 1 0 0.100 mysql.time_zone_transition04-24T16:06:47 0 0 0 1 0 0.095 mysql.time_zone_transition_type04-24T16:06:47 0 1 38 1 0 0.100 mysql.userHow to check slavesOnce you've collected all those fancy checksums for your tables, nicely timestamped, you'll want to verify that your slaves are happily in sync. You can do that with the following command, also on the master: $ pt-table-checksum --replicate=test.checksum --replicate-check-only --databases=mysql localhostIf there's no differences you'll see no output. If you have a difference it'll look something like this: Differences on ip-10-15-27-19TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARYmysql.user 1 1 1In our case you can see we created some users on the slaves accidentally, hence the differences. It illustrates how easy it is for differences to creep into your environment and also how easy it now is to find them!Special CasesSince one of my clients uses Drupal, they've had trouble replicating the semaphore table. This tables is a MyISAM table, and unfortunately no one dares convert it to InnoDB. So from time to time some gunk builds up in there, and it fails on the slave. We could clean out the table, but we decided to just filter out this one table. Since Drupal doesn't use fully qualified schema.table names in it's code, only "use" we have found this to be safe.However the percona toolkit explicitely checks for replication filters and will not run. It'll stop with an error as follows: $ pt-table-checksum --replicate=test.checksum --databases=sean --ignore-tables=semaphore localhost04-24T15:59:29 Replication filters are set on these hosts:ip-10.15.27.19replicate_ignore_table = sean.semaphoreip-10-15-27-72replicate_ignore_table = sean.semaphoreip-10-15-27-18replicate_ignore_table = sean.semaphorePlease read the --check-replication-filters documentation to learn how to solve this problem. at /usr/bin/pt-table-checksum line 6166.The solution is the --nocheck-replication-filters option. Keep in mind that this sanity check is there for a reason, so be sure to skip the relevant tables in your checksum building, and checksum checks.To build checksums skipping the semaphore table use this command: $ pt-table-checksum --replicate=test.checksum --ignore-tables=prod.semaphore --nocheck-replication-filters localhostNow you can check your slaves but ignore the semaphore table: $ pt-table-checksum --replicate=test.checksum --replicate-check-only --ignore-tables=prod.semaphore --nocheck-replication-filters localhostWe also found a bug which preventing us from specifying multiple tabes on the ignore-tables line. So we used multiple invocations to do different schemas like this: $ pt-table-checksum --replicate=test.checksum --replicate-check-only --databases=prod1 --ignore-tables=prod1.semaphore --nocheck-replication-filters localhost$ pt-table-checksum --replicate=test.checksum --replicate-check-only --databases=prod2 --ignore-tables=prod2.semaphore --nocheck-replication-filters localhostCrash ProtectionIf you've used MySQL replication for any length of time, you've probably seen a server crash. MySQL replication can have trouble restarting if you're using temporary tables, as they'll be missing upon restart. Also MySQL before 5.5 leaves syncing the info files to the operating system. So they may be incorrect after a crash.1. Use MySQL 5.5 if possibleThere are some new parameters in 5.5 that protect the info files from a crash. These are a great addition, and will make your slave databases more bulletproof. sync_master_info = 1sync_relay_log = 1sync_relay_log_info = 12. Don't use temporary tablesAfter a restart they'll simply be gone, so queries requiring or running against them will fail.The Percona guys' new book High Performance MySQL third edition, suggests an alternate solution to using temporary tables. Use a special schema to hold your temp data, but create them as normal permanent tables. Be sure your application creates them with unique names, using the connection_id() for example. Lastly have a cleanup process drop tables periodically, based on closed connection_ids.For more articles like these go to iHeavy, Inc +1-212-533-6828

  • Old repositories
    Just as one does, I was finding stuff to delete and I stumbled across this old directory which had dozens of BitKeeper work directories. Of course, I don't have a valid license for BitKeeper anymore and these directories haven't been touched in years! I guess I shall sit on them for a while.

  • Collaborate 2012 – Day 3
    Virtualization is important and Dave Welch from the House of Brick gave a great presentation of experiences with VMWare and Tier 1 databases. It was a comprehensive presentation, but the white paper was easier to follow. The slides were complete but the volume of information was a lot for an hour presentation. Well worth the time though. Utah Oracle User Group (UTOUG) announced a call for Fall Symposium papers today. The Fall Symposium will be in Salt Lake City on 9/6/2012. If you’re interested in presenting on Oracle or MySQL, the call for presentations will be open until 6/15/2012. The conference party was tonight, and it provided some nice orderves and pizza. The theme was a return to 1980s music, and some folks really dressed their parts. You can listen to a short snapshot of the band by clicking the image to launch a small video segment. I’m looking forward to the APEX Behind the Scenes presentation at 8:30 a.m. tomorrow. When the conference is over, I won’t miss the smoke filled air that we walk through from the Luxor to the Mandalay. It’s really amazing that the complex is more than a mile in length. It runs from the Luxor to the Mandalay South Conference Center.

  • An old way of sandboxing MySQL
    While preparing for the yesterday's MySQL/MariaDB 5.5 talk at the Mail.Ru technical forum, I downloaded the source code of three 5.5 forks - Oracle, Percona and MariaDB. And was happy that my g'old way of sandboxing an instance without installing it still works. I don't know whether MySQL sandbox is using this approach under the hood, here it goes: 1. Make sure /etc/mysql/my.cnf is not present or commented out. 2. Create a ~/.my.cnf with few important lines: [client] port = 3307 host = 127.0.0.1 socket = /opt/local/var/mysql/mysql.sock [mysqld] gdb # this one is necessary just to sandbox max_allowed_packet=16M port=3307 socket=/opt/local/var/mysql/mysql.sock language=/home/kostja/work/mariadb/5.5/sql/share/english character-sets-dir=/home/kostja/work/mariadb/5.5/share/charsets basedir=/home/kostja/work/mariadb/5.5 datadir=/opt/local/var/mysql server_id=1 These two steps are not strictly necessary, but they allow you to avoid the "mysql will choose the most appropriate cnf file" foo. Of course, you need to make sure that all paths in the configuration file point to correct locations at the source tree, and the data dir exists and is writable. 3. Now we need to populate the data directory. Here's how: Fire up shell> mysqld --bootstrap type create database mysql type ctrl-d. You could do the same in step 4, it's just fun that when nothing else works you can send queries to mysqld using the standard input. 4. Now let's restart mysqld with --skip-grant-tables, create all the necessary system tables and fill them with data: shell> mysqld --skip-grant-tables ...firing up the mysql command line client: shell> mysql use mysql -- the below scripts don't choose the default database source /home/kostja/work/mariadb/5.5/scripts/mysql_system_tables.sql source /home/kostja/work/mariadb/5.5/scripts/mysql_system_tables_data.sql Once this is all done, we can restart mysqld with no extra switches, issue the necessary grants, and get it going.

  • MySQL at Collaborate 12 — Day 3
    Collaborate 12 is in Vegas and Collaborate 13 will be in Denver. Mark your calendars now! The last MySQL session for this year’s Collaborate is over. There is another day of sessions to go but they will go on without MySQL-centric topics. The majority of those the MySQL Community Team have talked to are Oracle professionals looking to expand their support for existing MySQL instances in their organizations or those looking to move some aspect of their data over to MySQL. Many use Collaborate for obtaining technical information as they feel Oracle Open World is too frenzied to get niyty gritty details. Attendance at MySQL sessions were much larger than last year and demo pod booth action was heavy with those seeking specific answers to their questions. And they were seeking details on the NoSQL/SQL InnoDB/NDB interface, operational best practices, and MySQL cluster. In 2013, Collaborate will be in Denver with a MySQL hungry audience. Hopefully you will be there next year!

  • Testing STEC SSD MACH16 200GB SLC
    Following my previous benchmark of Samsung 830, today I want to show results for STEC MACH16 SATA card, 200GB size, this card is based on SLC, and regarding STEC website, it is an enterprise grade storage. For tests I use sysbench fileio, 16KiB block size (to match workload from InnoDB, as this is primary usage for me), and recently I switched to use async IO mode. There are two reasons for that. First, MySQL/InnoDB uses async writes, so this will emulate database load, and second, async mode allows to see maximal possible throughput, it does not show reliable latency though, as it appears there is no a reliable way in the Linux asynchronous IO library to get time metrics for particular IO block. so my testing command line looks like: sysbench --test=fileio --file-total-size=${size}G --file-test-mode=rndwr --max-time=18000 --max-requests=0 --num-threads=$numthreads --rand-init=on --file-num=64 --file-io-mode=async --file-extra-flags=direct --file-fsync-freq=0 --file-block-size=16384 --report-interval=10 run You may see I gather metrics every 10 sec to see how stable the performance is, and it really helps to observe some artifacts, as you will see in following graphs. Hardware for tests: HP ProLiant DL380 G6, filesystem: ext4, mounted with nobarrier. The results for random write case (8 async IO threads): In general it shows stable throughput topping to 148 MiB/sec, but every 20 min, there is small drop to 87 MiB/sec, which I guess is related to internal garbage collector activity. The results for random read case: Very stable throughput on line 222 MiB/sec To understand better what kind of response time we should expect, I ran random read sync IO mode, now for 1-64 threads. The throughput: We are getting to the peak throughput at 8 threads. And response time: For 8 threads, we may expect 0.62ms response time. In general I have very good experience with this card, and it seems suitable to work with MySQL. I will publish sysbench oltp benchmarks running MySQL on RAID10 over 4 STEC MACH16 cards. If you are interested more in SSD and MySQL questions – I will be giving a webinary “MySQL and SSD” on May-9. It will be the same as my talk on Percona Live MySQL Conference 2012, if you did not attend my talk – you are welcome to join the webinar. Disclaimer: This benchmark is done as part of consulting work for STEC, but this post is totally independent and fully reflects our opinion. Follow @VadimTk

  • YEAR(2) Challenge
    You might remember this little thing called the Y2K bug.  That’s twelve years ago, but MySQL still supports a YEAR(2) data type.  A recent internal discussion made me realize there were aspects of YEAR(2) that I didn’t fully understand, and prompts me to challenge readers to come up with a truly viable use case for YEAR(2) data types. A primary reason for storing only the last two digits of years is to save on storage space, so perhaps YEAR(2) makes sense for big data or in situations where storage is constrained, right?  Well, no – the number of bytes used to store YEAR(2) data is the same as what is required when YEAR(4) is used: mysql> SHOW CREATE TABLE y2\G *************************** 1. row *************************** Table: y2 Create Table: CREATE TABLE `y2` ( `y` year(2) DEFAULT NULL ) ENGINE=MyISAM DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> SHOW CREATE TABLE y4\G *************************** 1. row *************************** Table: y4 Create Table: CREATE TABLE `y4` ( `y` year(4) DEFAULT NULL ) ENGINE=MyISAM DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> SELECT COUNT(*) FROM y2; +----------+ | COUNT(*) | +----------+ | 33554432 | +----------+ 1 row in set (0.00 sec) mysql> SELECT COUNT(*) FROM y4; +----------+ | COUNT(*) | +----------+ | 33554432 | +----------+ 1 row in set (0.00 sec) D:\mysql-5.5.20-win32\data\test>dir *.MYD Volume in drive D is Data Volume Serial Number is 4015-B2FF Directory of D:\mysql-5.5.20-win32\data\test 04/25/2012  10:43 AM       234,881,024 y2.MYD 04/25/2012  10:45 AM       234,881,024 y4.MYD Huh. Well, that doesn’t make much sense, since YEAR(2) only stores the last two digits, while YEAR(4) stores all four – right?  Nope, YEAR(2) stores the full four digits, and has the same range of values as YEAR(4): mysql> TRUNCATE y2; Query OK, 0 rows affected (0.09 sec) mysql> TRUNCATE y4; Query OK, 0 rows affected (0.05 sec) mysql> INSERT INTO y4 VALUES (1901), (2000), (2155); Query OK, 3 rows affected (0.02 sec) Records: 3  Duplicates: 0  Warnings: 0 mysql> SELECT * FROM y4 ORDER BY y; +------+ | y    | +------+ | 1901 | | 2000 | | 2155 | +------+ 3 rows in set (0.00 sec) mysql> INSERT INTO y2 SELECT * FROM y4; Query OK, 3 rows affected (0.01 sec) Records: 3  Duplicates: 0  Warnings: 0 mysql> SELECT * FROM y2 ORDER BY y; +------+ | y    | +------+ |   01 | |   00 | |   55 | +------+ 3 rows in set (0.00 sec) mysql> ALTER TABLE y2 MODIFY COLUMN y YEAR(4); Query OK, 0 rows affected (0.13 sec) Records: 0  Duplicates: 0  Warnings: 0 mysql> SELECT * FROM y2 ORDER BY y; +------+ | y    | +------+ | 1901 | | 2000 | | 2155 | +------+ 3 rows in set (0.00 sec) OK, so maybe one would use YEAR(2) because they really, really like the fun behavior that assumes that values 70-99 are 1970-1999, and values 01-69 are years 2001-2069: mysql> TRUNCATE TABLE y2; Query OK, 0 rows affected (0.02 sec) mysql> ALTER TABLE y2 MODIFY COLUMN y YEAR(2); Query OK, 0 rows affected (0.03 sec) Records: 0  Duplicates: 0  Warnings: 0 mysql> INSERT INTO y2 VALUES (69), (70); Query OK, 2 rows affected (0.00 sec) Records: 2  Duplicates: 0  Warnings: 0 mysql> SELECT * FROM y2 ORDER BY y; +------+ | y    | +------+ |   70 | |   69 | +------+ 2 rows in set (0.00 sec) mysql> ALTER TABLE y2 MODIFY COLUMN y YEAR(4); Query OK, 0 rows affected (0.06 sec) Records: 0  Duplicates: 0  Warnings: 0 mysql> SELECT * FROM y2 ORDER BY y; +------+ | y    | +------+ | 1970 | | 2069 | +------+ 2 rows in set (0.00 sec) Except that behavior is identical to YEAR(4): mysql> TRUNCATE TABLE y4; Query OK, 0 rows affected (0.02 sec) mysql> INSERT INTO y4 VALUES (69), (70); Query OK, 2 rows affected (0.02 sec) Records: 2  Duplicates: 0  Warnings: 0 mysql> SELECT * FROM y4 ORDER BY y; +------+ | y    | +------+ | 1970 | | 2069 | +------+ 2 rows in set (0.00 sec) There’s one area where behavior is different, and that’s in handling of the numeric value zero:   mysql> TRUNCATE TABLE y4; Query OK, 0 rows affected (0.01 sec) mysql> INSERT INTO y4 VALUES (0); Query OK, 1 row affected (0.00 sec) mysql> SELECT * FROM y4; +------+ | y    | +------+ | 0000 | +------+ 1 row in set (0.00 sec) mysql> TRUNCATE TABLE y2; Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO y2 VALUES (0); Query OK, 1 row affected (0.01 sec) mysql> SELECT * FROM y2; +------+ | y    | +------+ |   00 | +------+ 1 row in set (0.00 sec) Looks the same so far, right?  Under the hood, though, it’s not: mysql> ALTER TABLE y2 MODIFY COLUMN y YEAR(4); Query OK, 0 rows affected (0.03 sec) Records: 0  Duplicates: 0  Warnings: 0 mysql> SELECT * FROM y2; +------+ | y    | +------+ | 2000 | +------+ 1 row in set (0.00 sec) So a numeric zero inserted into a YEAR(2) column will be effectively stored as 2000, while the same value inserted into a YEAR(4) will be stored as “0000″ (invalid).  So, there’s one use case: When your application has to insert numeric zero values and have them stored as the year 2000.  Probably not terribly compelling. Here’s one more not-so-compelling use case:  You really, really care about the number of bytes sent over the network, and only need the last two digits of a year.  As can be seen here, the number of bytes sent is lower for YEAR(2) data than YEAR(4) data – two bytes per row:   mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select * from y2 limit 1; +------+ | y | +------+ | 12 | +------+ 1 row in set (0.00 sec) mysql> show session status like 'bytes_sent'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | Bytes_sent    | 77    | +---------------+-------+ 1 row in set (0.00 sec) mysql> flush status; Query OK, 0 rows affected (0.00 sec) mysql> select * from y4 limit 1; +------+ | y    | +------+ | 2012 | +------+ 1 row in set (0.00 sec) mysql> show session status like 'bytes_sent'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | Bytes_sent    | 79    | +---------------+-------+ 1 row in set (0.00 sec) But wait, if you can save bytes there, can you save bytes in other locations, like temporary tables created for sorting?  Nope – the full data is represented in the sort, otherwise this wouldn’t sort correctly: mysql> INSERT INTO y2 VALUES (1969), (1970), (2000), (2071); Query OK, 4 rows affected (0.00 sec) Records: 4  Duplicates: 0  Warnings: 0 mysql> SELECT * FROM y2 ORDER BY y; +------+ | y    | +------+ |   69 | |   70 | |   00 | |   71 | +------+ 4 rows in set (0.00 sec) There is some funky behavior related to equality matches, though: mysql> SELECT y2.y y2, y4.y y4 FROM y2 LEFT JOIN y4 ON (y2.y = y4.y); +------+------+ | y2   | y4   | +------+------+ |   69 | NULL | |   70 | 1970 | |   00 | 2000 | |   71 | NULL | +------+------+ 4 rows in set (0.00 sec) mysql> SELECT * FROM y2 WHERE y = 1969 OR y = 2071; +------+ | y    | +------+ |   69 | |   71 | +------+ 2 rows in set (0.00 sec) mysql> ALTER TABLE y2 MODIFY COLUMN y YEAR(4); Query OK, 0 rows affected (0.05 sec) Records: 0  Duplicates: 0  Warnings: 0 mysql> SELECT y2.y y2, y4.y y4 FROM y2 LEFT JOIN y4 ON (y2.y = y4.y); +------+------+ | y2   | y4   | +------+------+ | 1969 | 1969 | | 1970 | 1970 | | 2000 | 2000 | | 2071 | 2071 | +------+------+ 4 rows in set (0.00 sec) Pretty strange, no?  I actually suspect this may be a bug and will report it as such. So YEAR(2) really does have the same underlying data storage requirements as YEAR(4), and there’s no internal optimization for sorting or anything like that – in fact, the only areas where there might be optimization seem to produce bad results, as shown in the last example.  There’s extremely limited savings in terms of network overhead, and special handling of numeric zeros.  Since the MySQL server sends the results over the wire with just the last two years, that would be moderately helpful if that’s how your application needs to consume the data – but it also causes the following behavior, which is an absolute deal-breaker for me.  First, here’s the original table creation and data: mysql> CREATE TABLE yd (y2 YEAR(2), y4 YEAR(4)); Query OK, 0 rows affected (0.03 sec) mysql> INSERT INTO yd (y2) VALUES (1969), (1970), (2000), (2071); Query OK, 4 rows affected (0.00 sec) Records: 4  Duplicates: 0  Warnings: 0 mysql> UPDATE yd SET y4 = y2; Query OK, 4 rows affected (0.01 sec) Rows matched: 4  Changed: 4  Warnings: 0 mysql> SELECT * FROM yd ORDER BY y2; +------+------+ | y2   | y4   | +------+------+ |   69 | 1969 | |   70 | 1970 | |   00 | 2000 | |   71 | 2071 | +------+------+ 4 rows in set (0.00 sec) Guess what happens when you recreate this from mysqldump? D:\>mysql-5.5.20-win32\bin\mysqldump -uroot -P3307 --add-drop-table test yd | mysql-5.1.38-win32\bin\mysql -uroot -P3307 test D:\>mysql-5.5.20-win32\bin\mysql -uroot -P3307 test -e"SELECT * FROM yd ORDER BY y2;" +------+------+ | y2   | y4   | +------+------+ |   70 | 1970 | |   71 | 2071 | |   00 | 2000 | |   69 | 1969 | +------+------+ Ouch.  That’s right, mysqldump produces 2-digit output, which – when replayed as an INSERT – prefixes the year with the centuries as described in the manual here.  The following demonstrates what’s really stored in the YEAR(2) columns after restoring using mysqldump: mysql> UPDATE yd SET y4 = y2; Query OK, 2 rows affected (0.00 sec) Rows matched: 4  Changed: 2  Warnings: 0 mysql> SELECT * FROM yd ORDER BY y2; +------+------+ | y2   | y4   | +------+------+ |   70 | 1970 | |   71 | 1971 | |   00 | 2000 | |   69 | 2069 | +------+------+ 4 rows in set (0.00 sec) Nasty stuff, and reason enough for me to never touch YEAR(2). Knowing the above, my challenge for you as a reader is to identify any compelling use case for YEAR(2).  Please feel free to offer suggestions as comments.

  • Webinar 5/2: Advanced Multi-Master Solutions Made Easy with Continuent Tungsten
    Webinar, Wednesday 5/2 @ 10 am PT/1pm ET Continuent Tungsten has broken new ground to make previously complex or completely unattainable MySQL data management solutions simple and easy to deploy. Continuent Tungsten brings high availability, performance scaling, and simple cluster management to off-the-shelf MySQL. Continuent Tungsten also includes advanced multi-master topologies that

  • Testing Samsung SSD SATA 256GB 830 – not all SSD created equal
    I personally like PCIe based Flash, but from a pricing point our customers are looking for cheaper alternatives. SATA SSD is an options. There is many products based on MLC technology, and Intel 320 I would say is the most popular. I do not particularly like its write performance – I wrote about it before, that’s why I am looking for comparable alternatives. Samsung 830 256GB looked like a good product, that’s why I decided to test it. For tests I use sysbench fileio, 16KiB block size (to match workload from InnoDB, as this is primary usage for me), and recently I switched to use async IO mode. There are two reasons for that. First, MySQL/InnoDB uses async writes, so this will emulate database load, and second, async mode allows to see maximal possible throughput, it does not show reliable latency though, as it appears there is no a reliable way in the Linux asynchronous IO library to get time metrics for particular IO block. so my testing command line looks like: sysbench --test=fileio --file-total-size=${size}G --file-test-mode=rndwr --max-time=18000 --max-requests=0 --num-threads=$numthreads --rand-init=on --file-num=64 --file-io-mode=async --file-extra-flags=direct --file-fsync-freq=0 --file-block-size=16384 --report-interval=10 run You may see I gather metrics every 10 sec to see how stable the performance is, and it really helps to observe some artifacts, as you will see in following graphs. Hardware for tests: HP ProLiant DL380 G6, filesystem: ext4, mounted with nobarrier. The results for random write case (8 async IO threads): It seems that InnoDB is not alone with its flashing problems. You can see there periodical stalls in throughput (0 throughput for 20-30 sec period of time). When there is no drops, the drive keep write throughput on 323 MiB/sec level. I really thought that these stalls are related, so I was totally surprised them in random reads also. The results for random read case: I do not have a good explanation for this. When there is no drop, the drive keeps 375 MiB/sec throughput. I may do a wild guess about drops – the drive periodically cleans an internal cache or something. To understand better what kind of response time we should expect, I ran random read sync IO mode, now for 1-64 threads. The throughput: We are getting to the peak throughput at 16-32 threads. And response time: For 16 threads, we may expect 0.96ms response time, which increases to 1.62ms under 32 threads. The periodic drops that I observe for both random reads and random writes do not allow me to recommend this drive for a database server usage, even in general this drive provides much better throughput than Intel 320 (some results for Intel 320). If you are interested more in SSD and MySQL questions – I will be giving a webinary “MySQL and SSD” on May-9. It will be the same as my talk on Percona Live MySQL Conference 2012, if you did not attend my talk – you are welcome to join the webinar. Follow @VadimTk

  • Managing MySQL Backups
    Database backups are typically critical to organizations, and are an important part of an overall disaster recovery strategy. MySQL Enterprise Backup performs online "Hot", non-blocking backups of your MySQL databases, and interfaces with media management software such as Symantec NetBackup, Oracle Secure Backup and IBM Tivoli Storage Manager to execute backup and restore operations. Two new white papers are available to help you better understand: How to Connect MySQL Enterprise Backup with Media Management systems through System Backup to Tape (SBT) interface. How to specifically use MySQL Enterprise Backup with Symantec NetBackup. Enjoy the white papers.

  • MySQL Cluster: How to load it with data
    After you have setup and properly configured MySQL Cluster you typically want to load data into it. There are a couple of ways to load data into MySQL Cluster, e.g, from dumpfiles or from csv files.If you expect loading data into MySQL Cluster will be as fast as on MyISAM then you have the wrong expectations unless you parallelize your data loading. Also, MYISAM/INNODB stores the data in local buffers/disk and in one copy only (asynchronous replication does not count as it is asynchronous) whereas MySQL Cluster (NDBCLUSTER engine) stores two copies of the data. Synchronous replication between the two copies adds ~ 2x overhead and you have network between mysql servers and data nodes.So, to load a dump file into MySQL Cluster is bound to be slower than MYISAM/INNODB. Also loading data files can be error prone.Here is what we at Severalnines usually do to avoid trial and terror of configuration options and to make sure we don't have to make X tries to load my Y GB dump file. We want to do it once, and get it right then, just because it takes time to make multiple iterations:Load in the schema in MYISAM/INNODB on one of the MySQL Servers connected to the data nodes.Please Note that your data is NOT yet stored in the NDBCLUSTER storage engine and is at this point only accessible from the mysql server where you imported the MYISAM/INNODB tables.Check for redundant indexes and remove them to reduce storage footprint. Some ORMs have a tendency for adding redundant indexes. Personally I think it is faster to make the transformations when the tables are in MYISAM/INNODB than in NDBCLUSTER, especially if there are data in the tables.You can use Percona's pt-duplicate-key-checker  (wget percona.com/get/pt-duplicate-key-checker). See an example below at the end. It makes it dead easy to find and drop the redundant indexes. Redundant indexes are good to remove also because writes will be faster (less indexes to update) and to reduce the risk that the optimizer will pick the wrong index.Check data types - do you have BLOB/TEXT where you really can use VARBINARY/VARCHAR ? BLOBs/TEXTs can slow down performance with a factor 2-3x (at least). Sometimes BLOBs/TEXTs are overused in many applications.When you have washed your schema then you can simply do:ALTER TABLE tablename ENGINE=ndbcluster;If you get an error message such as:mysql> alter table events  engine=ndbcluster;ERROR 1297 (HY000): Got temporary error 1217 'Out of operation records in local data manager (increase MaxNoOfLocalOperations)' from NDBCLUSTERThen you can change the ndb_batch_size (a lower batch size will make it longer time to ALTER the table from MYISAM/INNODB to NDBCLUSTER because more roundtrips are needed between the  MySQL Server and the data nodes):mysql> set  ndb_batch_size=8*1024*1024;Query OK, 0 rows affected (0.00 sec)mysql> alter table events  engine=ndbcluster;Query OK, 67748 rows affected (4.00 sec)Records: 67748  Duplicates: 0  Warnings: 032MB is the max value for ndb_batch_size.Now, to furthermore increase the speed of the ALTER you can perform two or more ALTERs in parallel from two or more connections:mysql conn 1 > alter table events     engine=ndbcluster;mysql conn 2 > alter table employees  engine=ndbcluster;mysql conn 3 > alter table randomdata engine=ndbcluster;Finally do:ANALYZE TABLE ;on all tables to build index statistics (MySQL Cluster 7.2 will love you for this and you will hate MySQL Cluster 7.2 if you don't).However, if you do too many in parallel you will most likely run into problems like:Redolog files are overloaded  (you have too small redo log and too slow disks to handle the write load generated by the ALTER)Out of Redo buffer (you have too slow disks to handle the write load generated by the ALTER) -  it will probably not matter if you increase the RedoBuffer as your disk subsystem it too slow and cannot "consume" what you "produce".Out of operation records   ( increase MaxNoOfConcurrentOperations or MaxNoOfLocalOperations), at Severalnines we set this to 100000 by default, you can of course increase this but it costs memory. If you increase it you should change:MaxNoOfConcurrentOperations, then MaxNoOfLocalOperations will automatically be set to 1.1 x the value of MaxNoOfConcurrentOperations. With MaxNoOfConcurrentOperations=100000 it will take 100MB of RAM on each data node. MaxNoOfConcurrentOperations=200000 will take 200MB of RAM on each data node.If you have a CSV file you can do the same: Load the data first into MYISAM/INNODB tables and then ALTER. Why first load it into MYISAM/INNODB?Well, i think it makes it easier and gives better control. If you have a 30GB dump file and many tables you want to load, then it is easier to track errors and to make use of the ndb_batch_size parameter to make it actually load without errors, i.e, you will save time, even though you will use some extra time to first load in the data into MYISAM/INNODB:Example run of ./pt-duplicate-key-checker:./pt-duplicate-key-checker  --host=ubuntu02 --user=cmon --password=xxxxxxx --databases=test2 --tables=t1# ################################################################### test2.t1                                                                # ################################################################### a is a duplicate of PRIMARY# Key definitions:#   KEY `a` (`a`)#   PRIMARY KEY (`a`),# Column types:#  `a` int(11) not null# To remove this duplicate index, execute:ALTER TABLE `test2`.`t1` DROP INDEX `a`;# ################################################################### Summary of indexes                                                      # ################################################################### Size Duplicate Indexes   0# Total Duplicate Indexes  1# Total Indexes            2

  • Collaborate 2012 – Day 2
    It seems the Titanic is everywhere, even inside the pyramid of the Luxor hotel. While the Luxor is within the Mandalay Bay complex, it’s about a half mile walk to the conference and a half mile back. We go by the Mandalay Conference Center’s aquarium. We thought it might be interesting but at $18 an admission, we opted to pass on it. It’s amazing to have an aquarium in the desert, but it’s probably not as nice as the Monterey Bay aquarium. It was interesting to start the day listening to Rich Niemiec on partitioning tables and using Exadata in Oracle. The NoSQL (Not Only SQL) presentations were interesting, as was the upgrading of Oracle 11gR2 in an E-Business Suite environment presentation. Then, I finished the day with what’s new with the Oracle VM Server. Checking out the exhibit hall I managed to get a signed copy of Rich Niemiec’s Oracle Database 11g Release 2 Performance Tuning Tips & Techniques and a copy of MongoDB The Definitive Guide.

  • The MySQL init-script mess
    I don’t think there is a single good-quality MySQL init script for a Unix-like operating system. On Windows, there is the service facility, and I used to write Windows services. Based on that, I believe Windows has a pretty good claim to better reliability for start/stop services with MySQL. What’s wrong with the init scripts? Well, let me count the reasons! Wait, I’m out of fingers and toes. I’ll just mention the two annoying ones that I’ve run into most recently. Both are on Debian, though there is nothing especially broken about Debian’s init scripts. The first one comes from parsing my.cnf wrong and not treating pid-file and pid_file identically. The server treats them identically, thus, so should any other program that reads the my.cnf file (there’s this program called my_print_defaults… use it!). The second bug is because Debian uses two configuration files for start/stop services: the init script reads /etc/mysql/debian.cnf for no discernable reason. (I guess they never heard of using [sections] in the /etc/mysql/my.cnf file, or just reading the [mysqld] section.) So if you configure your server to place its socket in a non-default location, you have to redundantly update /etc/mysql/debian.cnf too, or the init script will fail. Duplication of configuration parameters is just stupid, period. These are fairly mundane bugs. I’ve seen literally dozens more. Part of the problem is that each distribution that packages up and redistributes MySQL tends to ship with their own init script, instead of reusing the official scripts provided by MySQL. Understandable, because mysqld_safe is generic and doesn’t really integrate well with any system’s init facilities. But man, do they reinvent a bunch of lovely bugs, mostly related to things like parsing the .cnf files, handling pid files, handling sockets, special user accounts, braindead look-before-you-leap patterns of pinging before actually doing a task, stupid timeouts, wrong handling of log files and log file rotation, dumb hacks with syslog, failing to check for real evidence of a running process (you can’t trust what a cache file on disk says!), adding facepalm-worthy CHECK TABLES automatically on every table on server startup, and on and on. The official mysqld_safe script tends to be a little less broken, in my experience, but still has many unlovely behaviors and missing features that I’d consider to be bugs. I haven’t even mentioned the “manage multiple instances” scripts yet. Boy, do those have a ton of bugs. They do stupid things like grepping configuration files for strings that may or may not be in the configuration files. I remember one emergency case where MySQL couldn’t be started on a box because the string “mysql_multi” didn’t exist in a my.cnf file clearly designed for multiple instances to run. I added a comment to the effect of “# This comment is necessary for mysql_multi to work” and the problem was solved. A sane script would actually check for multiple instance definitions, not for some arbitrary string of characters. Anyway, this is just one tiny example, I don’t mean to dwell on it. What happens when you have a bad init script? All kinds of things. You can’t shut down the server gracefully, so if you shut down the system, you hard-crash MySQL eventually, and good luck getting replication back after that in most cases. You can’t start the server correctly, or it reports the wrong thing and then tries to start several instances, and the second one borks the first one’s pid file and/or socket, causing the aforementioned shutdown problem or worse. And on it goes. My principle is usually “don’t complain, do something about it.” But there’s a problem, in this case: writing a good init script is actually a significantly complex software engineering project. It is NOT “just a script.” (Insert my usual rant about the need for an actual test suite.) And that is not something I am working on at the moment, nor has it ever become my priority for the last several years. So in this case I’m complaining, because the writing on the wall says that I am probably never going to work on this, and I’d at least like there to be some visibility about what a serious problem this is. Distribution maintainers could probably improve the situation significantly by taking a look at each other’s bug reports. If everyone solved the same bugs everyone else has solved (and don’t forget bugs in mysqld_safe, too) that would be a big step forward. Further Reading:GnuCash to MySQL export script A tweak to column alignment for the mext script A script snippet for aggregating GDB backtraces A script snippet to relative-ize numbers embedded in text What to do when MySQL says skip-innodb is defined

  • More on DynamoDB - The good part!
    In a previous post on DynamoDB, I told you we were in the process of migrating to DDB and from MongoDB for our largest datastore. Now, we have moved a bit further on this and we, including myself, has pretty positive view on DDB, it really is a viable MongoDB alternative if you can live with the limitations. That said, there are many limitations, but I would like to put this differently. I would say this is an opportunity to simplify and get real good performance from your base level data, and combine it with other technologies where appropriate.I wouldn't think that any serious application that use a database could live with DynamoDB only, unless the application developers were prepared to do just about everything database related, beyond the most simple, themselves. For example, you might need a secondary index, DDB doesn't provide you with them, so what you could do is use another DDB table as an index into the main data. Which is fine, but you have to implement it yourself, no more CREATE INDEX statement, no more ensureIndex() command and no more "the index is there so the optimizer will use it" rather "I have now an index on that previously unindexed attribute, so I rewrite my code to take advantage of it".That said, how I would like to see DDB, and this is how we use it here at Recorded Future, is as a store for low level objects, like BLOB, pieces of text, pieces of XML, collections of keywords, you name it. Then you reference that data with an id that is looked up in some kind of supporting technology, like a free-text search engine or MySQL or both.What we are looking at doing here at Recorded Future is to use DDB for just this kind of stuff. The supporting technologies are, in our case, MongoDB (yes, MongoDB, we have data in MongoDB today that will not work well in DDB, data that has secondary indexes on it, data that has uses more features in MongoDB etc) and Sphinx. But this may change. The database we are moving from MongoDB to DDB is just so simple and straightforward as is required to make it a good for for DDB.And despite the limited functionality, DDB has several advantages:It performs well, and I can pay for only the throughput I need. Actually, pricing is one of the intriguing aspects of DDB, that you pay for throughput, basically, not for storage, number of servers, number of users or something as arcane as that.It is managed by Amazon, and Amazon seems to do a good job here.DDB currently lacks any kind of backup mechanism, and as DDB isn't exposed outside the managed Amazon DDB environment, there isn't much I can do about it, so I just ignore it and tell my managers that Amazon will not allow me to back up our data (yes, I am kidding now).There are several reasonable well working APIs, Ruby, Java etc that is integrated in the same Amazon APIs as the other Amazon services (the REST based API that these are built on leaves a fair bit to be desired, as does the documentation).We are not live with DDB yet, we need to figure out a way to perform backups (as we can't get backups out of DDB, we have to find a way in out application to catch data before it enters DDB) for example, and we have coding to do, but my initial reservations regarding DDB are not as strong as they used to be, but one has to know the limitations, fave the facts and work with them. But that is life in a Cloud environment anyway.As for the DDB pricing model, should we call that Cloud-based prising? /Karlsson

  • More changes from the Facebook patch
    I pushed more changes from the Facebook patch. The patches include all changes from the excellent work by Nizam to make InnoDB compression better for OLTP. Everyone is welcome to look at the patches but my target audience for these are people who work on official MySQL, MariaDB and Percona. The changes include:add option to rely only on innodb checksum for compressed pages and avoid zlib's use of adler32 - revision 3818add linear algorithm for computing dynamic padding - revision 3820add option to use fast innodb checksum for compressed pages - revision 3824provide option in mysqldump to use START TRANSACTION WITH CONSISTENT INNODB SNAPSHOT - revision 3829save CPU by only computing compressed page checksums when needed - revisions 3816 and 3817allocate fewer free pages to reduce database file size - revision 3814support compressed pages in innochecksum - revision 3813avoid calling malloc and free for every page compression/decompression - revision 3808allocate fewer pthread structs for InnoDB buffer pool pages - revision 3811support my.cnf variables with non-integer default values - revisions 3812 and 3822

  • MySQL Connector/Net 6.3.9 has been released
    MySQL Connector/Net 6.3.9, the latest maintenance release of our 6.3 version series, has been released.  Connector/Net is our all-managed .NET driver for MySQL.  This release will be the last release of our 6.3.x series and contains more than 25 fixes from the 6.3.8 base.  Users looking for additional fixes or features should upgrade to our most recent version.  You can see the list of changes and read more about the connector at http://dev.mysql.com/doc/refman/5.5/en/connector-net.html Version 6.3.9 is appropriate for use with versions of MySQL 5.0-5.5. It is now available in source and binary form from http://dev.mysql.com/downloads/connector/net/6.3.html#downloads and mirror sites (note that not all mirror sites may be up to date at this point-if you can't find this version on some mirror, please try again later or choose another download site.) The release is also available for download on My Oracle Support (MOS). Enjoy and thanks for the support! 

  • Face the facts: There is no Big Data / Big Analytics silver bullet
    We have a lot more storage space available these days, and a lot more data to work with, so Big Data and Big Analytics is getting much more mainstream now. And there are conclusions and insights you can get from that data, any data more or less, but Web data in particular brings a new dimension when combined with more traditional, domain specific data. But this data is also mostly in the shape of plain text, like your blogs, twitters, news articles and other web content. And this in turn means that to combine your organized structures sales data for 20 years with Web data, the Web data first needs to be analyzed.Web data also brings in a new difficulty: the data is big, and it's not organized at it's core, so you can not easily aggregate or something like that to save space (and why would you want to do that?). It's not until after you have analyzed it that you know what data is interesting and what is not. And to be frank (but I am not, I'm Anders), not even then can you start to aggregate data or throw data away that isn't interesting. And in my mind, this is an mistake that has been done in all sorts of analytics, even with smaller amounts of data.When it comes to analytics, in my mind "If you think you have all the right answers, you haven't asked all the right questions". This is an important point, analytics is a recurring activity, and the more questions you get answered, the more questions you should get. And with this in mind, how can you know what to aggregate? In particular when it comes to web content?So, can we live with Web data not being aggregated and how do we do it? What database can support that? Oracle? MySQL? MongoDB? Vertica? And the answer is, in the same way as with analytics, you will not know when you start analyzing, and once you have started doing that, you will be even more in doubt! Which technology supports all the aspects you might need to look at? And the keyword is might!So, how can we solve this? And my answer is: By using the right tool for the job at hand, and be prepared to combine different tools! Postgres and Oracle are great for temporal analysis, for GIS we have Oracle, MySQL and PostGIS. For handling large amounts of data with good scalability and keeping the cost down, you might want a key-value store like MongoDB or DynamoDB. To search data you might head for Sphinx or Lucene. Etc. etc.As an example, I'd might want to look at a key-value store for my raw Web data, holding some key for easy lookup. An RDBMS for the attributes of this data. Sphinx for searching it. Sphinx and Lucene are much better tools than your average RDBMS, be it MySQL or Oracle or Whatever, and RDBMS search is different than a text search in web-data!So the most important aspect to look at, if you ask me, is to choose technologies that can easily be combined and where different aspects of data can be served by different technologies as appropriate. And be prepared to add, remove and change technologies as you go along!/Karlsson

  • When is wait_timeout not wait_timeout?
    Over the weekend I came across an extremely curious issue with MySQL. It seemed that no matter how many times I tried to set the wait_timeout, it would always show the value of interactive_timeout. I even tried restarting mysql, to no avail. Eventually I figured it out. When I was in an *interactive session*, wait_timeout displays as the value of interactive_timeout. Otherwise, it showed the appropriate value. Here’s what I found, when interactive_timeout was set to 600 and wait_timeout was set to 14400 (this is on an analytics server, so setting the value that high actually makes sense): [root@mysql1 ~]# mysql -e "show variables like 'interactive_timeout'" +---------------------+-------+ | Variable_name | Value | +---------------------+-------+ | interactive_timeout | 600 | +---------------------+-------+ [root@mysql1 ~]# mysql -e "show variables like 'wait_timeout'" +---------------+-------+ | Variable_name | Value | +---------------+-------+ | wait_timeout | 14400 | +---------------+-------+ When using non-interactive logins, like mysql -e “COMMAND”, wait_timeout has the appropriate value. However, in an interactive session, wait_timeout had the same value as interactive_timeout: [root@mysql1 ~]# mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 23814 Server version: 5.1.61-rel13.2-log Percona Server (GPL), 13.2, Revision 430 Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners. Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> show variables like 'interactive_timeout'; +---------------------+-------+ | Variable_name | Value | +---------------------+-------+ | interactive_timeout | 600 | +---------------------+-------+ 1 row in set (0.00 sec) mysql> show variables like 'wait_timeout'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | wait_timeout | 600 | +---------------+-------+ 1 row in set (0.00 sec) I observed this behavior with Percona Server 5.5.21-55, and with Oracle’s MySQL 5.1.61 and 5.0.77, so it is neither a new feature, nor is it limited to Percona only. Putting on my “reverse engineering” hat, my guess is that MySQL looks at the value of “wait_timeout” to decide when to timeout, and when you use an interactive session, wait_timeout is set to the value of interactive_timeout. In other words, I guess that interactive_timeout serves only to set wait_timeout for interactive sessions. I am not sure if this is a bug or a feature, but I have seen plenty of these kinds of “subtle hacks” in MySQL so it would not surprise me if this is the way it was intended to work. It’s extremely confusing to figure out though, when you try to set the variable and then check it….here is one of my frustrating sessions, where the change didn’t seem to “stick”: mysql> set global wait_timeout=14400; Query OK, 0 rows affected (0.00 sec) mysql> show variables like 'wait_timeout'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | wait_timeout | 600 | +---------------+-------+ 1 row in set (0.00 sec) mysql> exit Bye [root@mysql1 ~]# mysql Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 23810 Server version: 5.1.61-rel13.2-log Percona Server (GPL), 13.2, Revision 430 Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> show variables like 'wait_timeout'; +---------------+-------+ | Variable_name | Value | +---------------+-------+ | wait_timeout | 600 | +---------------+-------+ 1 row in set (0.00 sec) Edited to Add - My guesses were correct! Many people have pointed out, this is a documented way that this works.

  • Collaborate 2012 – Day 1
    Collaborate 2012 started on Sunday but for me I began on Monday. I enjoyed Bob Burgess, SalesForce, presentation on shell scripting for MySQL Administration today. It preceded my presentation in the same room, which I thought was an interesting coincidence since we got our conference credentials together. I presented on portable SQL between Oracle and MySQL. The presentation went well. Before I took questions, I got to ask them because I had three copies of my new Oracle Press book to give away: Oracle Database 11g and MySQL 5.6 Developer Handbook. Handing out the books served as a nice ice breaker for the audience to ask questions about the presentation. My favorite question was, “Will Oracle continue to improve MySQL?” My answer to that is always simple because Oracle’s support for MySQL has and continues to be great, “Oracle only spends money on winners and that means MySQL wins.” Oracle product management was in attendance and they re-enforced Oracle’s commitment to MySQL. At 6 p.m., the Exhibit Hall opened and I checked it out. Cisco hired Kathy Bailey to draw caricatures, and she drew mine as you can see at the left. I’m looking forward to more presentations tomorrow.

  • Using the MySQL stack trace to isolate bugs
    I came across an interesting error reported on #mysql the other day. When I went through it with the reporter it looks like we uncovered up to two bugs in InnoDB (or rather XtraDB as it was Percona Server). I thought it might be useful to go through the error message, including the stack trace, to show that you don't need to be a developer to track down some useful information. read more

  • Collaborate — Day 1
    The MySQL sessions at Collaborate started strong after an amazing keynote by former astronaut Mark Kelly about working to become a naval aviator, astronaut, and helping his wife — Congresswoman Gabrielle Giffords — after an assassination attempt on her life last year. A rare moment when the Oracle demo pods where not wall to wall people. First up was Set up MySQL in Five Minutes by Bob Burgess of Radian6. Most of the attendees to these sessions seem to be long time Oracle DBAs looking to add more MySQL skills or long time Oracle AND MySQL DBAs. Bob then had a second session and covered Shell Scripting for MySQL Administration where most of the crowd of twenty had already had lots of shell programming experience. BYU’s Dr. Mike McLaughin provided an excellent hour on Portable SQL between Oracle and MySQL that was filled with observations of both products. He also stressed features of each that were best avoided for portability. Then Lee Stigile presented MySQL High Availability Solutions and Case Studies. Lee compared using replication, DRBD, and MySQL Cluster while detailing where each shines best operationally. Administering MySQL for Oracle DBAs drew an audience of over thirty for Nelson Calero from the Uruguayan Oracle User Group. Nelson had been using Oracle for a decade and MySQL for five years and gave the audience some solid tips on MySQL mastery. Then it was booth duty time for the MySQL Community Team and for two hours we answered questions about MySQL and it seemed like all six thousand plus attendees were in the Oracle booth at one time. Day two will feature Challenges of Big Databases with MySQL, part two of Set up MySQL in Five Minutes, MySQL for Oracle or How to Speak MySQL for Beginners, Security Boot Camp: Security Around MySQL, and a sure to be controversial Demystifying MySQL for Oracle DBA and Developers wherin George Trujillo will make the case that all new MySQL databases should be virtualized.

  • A first look at linq4j
    This is a sneak peek of an exciting new data management technology. linq4j (short for "Language-Integrated Query for Java") is inspired by Microsoft's LINQ technology, previously only available on the .NET platform, and adapted for Java. (It also builds upon ideas I had in my earlier Saffron project.) I launched the linq4j project less than a week ago, but already you can do select, filter, join and groupBy operations on in-memory and SQL data. In this demo, I write and execute sample code against the working system, and explain the differences between the key interfaces Iterable, Enumerable, and Queryable. For those of you who want to get a closer look at the real code, here's one of the queries shown in the demo: DatabaseProvider provider =    new DatabaseProvider(Helper.MYSQL_DATA_SOURCE);provider.emps    .where(        new Predicate1<Employee>() {            public boolean apply(Employee v1) {                return v1.manager;            }        })    .join(        provider.depts,        new Function1<Employee, Integer>() {            public Integer apply(Employee a0) {                return a0.deptno;            }        },        new Function1<Department, Integer>() {            public Integer apply(Department a0) {                return a0.deptno;            }        },        new Function2<Employee, Department, String>() {            public String apply(Employee v1,                                Department v2) {                return v1.name + " works in " + v2.name;            }        }    )    .foreach(        new Function1<String, Void>() {            public Void apply(String a0) {                System.out.println(a0);                return null;            }        }    ); and here is its (not yet implemented) sugared syntax: List<String> strings =    from emp in provider.emps,        join dept in provider.depts on emp.deptno == dept.deptno    where emp.manager    orderBy emp.name    select emp.name + " works in " + dept.name; For more information, visit the linq4j project's home page.

  • It's alive!
    LinkedIn has what they call "inDays" where employees may so something interesting which may not be directly related to their day job. I spent my inDay by porting my old WL820 project (External Language Stored Procedures) to MariaDB 5.3. The code, as usual, is available on LaunchPad ... To get the branch, simply do: bzr branch lp:~atcurtis/maria/5.3-wl820 The test cases pass... I haven't tested

  • #CLV12 Day 1
    A good day for MySQL at COLLABORATE 12. Most of the sessions had good attendance.  Dave and I also got the pleasure of meeting Michael McLaughlin of BYU. He is also an Oracle ACE in Database App Development.Michael is a support of Oracle and MySQL and I have just recently added his blog to the Planet site.

2012 Essential Links [http://ess-links.com] All Supports are wellcomed.