I lived in a range of places as a kid, partly because my father was a bit of an itinerant who didn’t know what he wanted in life, other than that I mustn’t live with my mother. Go figure.
Eventually I got to settle down with my grandmother, but in the process I learned a lot about life as a child in different places. Where I felt safe and where I did not.
I did not feel safe in large council estates surrounding cities. I did feel safe in a caravan park. I did not feel safe in a city centre. I did feel safe in a built up part of a large city, living in an apartment block.
The Heroku platform is an absolutely fantastic way to avoid having to bother with devops within a small development company. We’ve been using it at interconnect for years now, and whilst it’s not entirely perfect, it takes away one set of headaches and does so at a reasonable cost.
All the services offer backups, and the VMs are built from scripts and are essentially read only. So if something catastrophic happened to one of our databases, we could roll back a day and be OK. Except… let me explain my fears around data.
Trust issues with providers
In our very earliest years we used a VPS provider that used Plesk. Everything was solid and stable until one day, we got a report that a site had been hacked. Then another. It turned out that a vulnerability had exposed our sites to being hacked. And they were. This resulted in a big old clean up operation and restoration from backups. Except the daily backups we’d been paying for turned out to be weekly. So the backups we had were three days old. Ever since then, I’ve preferred to have a way of pulling backups separately to a server under my own control, unless the provider is Kumina, because I know the people so well that I’m 100% certain they’re as paranoid as I am and they’ve never ever let me down. But in the era of hustle culture bros who move fast and break things, you need a safety net.
Creeping corruption
My next fear is corruption you don’t notice immediately. I can well imagine that if all the metadata for the posts on a site before a certain date got wiped out, most people wouldn’t notice for ages. Imagine you’ve got a site with 200,000 posts, and various elements of the first 100,000 were damaged – the long tail matters to these sites and suddenly it’s all gone. Well, thank heavens for backups!
Except, of course, most cloud providers don’t provide substantial generational backups. Instead, they keep a few days or a week or so. And that’s your lot. If you need to go back months you’d better hope a developer in the company left a dump on their laptop somewhere – except of course that very very few developers keep dumps of production systems on their laptops – it’s bad practice, only tends to happen in exceptional circumstances, and any such dump should be deleted soon after use.
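To make "generational" concrete, here is a minimal sketch of one common retention scheme (often called grandfather-father-son): keep the last few daily backups, then the newest backup from each recent week, then the newest from each recent month. The function name and the 7/4/12 defaults are assumptions for illustration only, not a description of how any particular provider or our own service works.

```python
from datetime import date, timedelta


def backups_to_keep(dates, daily=7, weekly=4, monthly=12):
    """Return the set of backup dates a generational policy would retain.

    Hypothetical policy: the newest `daily` backups, plus the newest backup
    in each of the last `weekly` ISO weeks, plus the newest backup in each
    of the last `monthly` calendar months.
    """
    newest_first = sorted(set(dates), reverse=True)
    keep = set(newest_first[:daily])

    # Dicts preserve insertion order, so the first entries are the newest
    # weeks/months, and setdefault keeps the newest date seen for each.
    newest_per_week = {}
    newest_per_month = {}
    for d in newest_first:
        newest_per_week.setdefault(tuple(d.isocalendar())[:2], d)
        newest_per_month.setdefault((d.year, d.month), d)

    keep.update(list(newest_per_week.values())[:weekly])
    keep.update(list(newest_per_month.values())[:monthly])
    return keep


# Example: 400 consecutive daily backups ending on an arbitrary date.
today = date(2024, 6, 30)
history = [today - timedelta(days=n) for n in range(400)]
kept = backups_to_keep(history)
```

With a policy like this, a year of history costs only a couple of dozen stored dumps rather than hundreds, which is what makes going back months affordable.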
How we fix it today
In the end, I asked one of my Linux oriented colleagues, Gianluigi, to create a service that would connect to Heroku’s API and then download every database, and sync every S3 bucket. It worked, with some limitations. More recently, because he’d left but remains a good friend, he helped me with a crash course in Linux sysadmin basics and I was able to extend and improve some bits. The system is a service written in PHP that does all the work. I then asked another colleague internally, Jack, to extend things to cover the PostgreSQL databases we also now used and to create a dashboard so that I could monitor the backups easily without resorting to logging into the backups servers.
The dashboard also doesn’t run on the backups servers. I needed to keep the backups as safe as possible – they’d be a great honeypot for a hacker, so they’re onioned away, and the backups service isn’t reachable from outside. Instead, it messages the dashboard with information about the backups taken. The dashboard also provides details on application and framework versions, for security monitoring and making sure updates have been applied appropriately, and it also sends me a daily summary email showing me storage space available and what was backed up in the previous 24 hours.
Here are a few screenshots of the system, with some censoring, but I hope you catch how it works from what you see.
To commercialise, or not?
And now to one of the reasons why I’ve decided to write about this. In the past, I created the first version of Search Replace DB – a quick script and algorithm I knocked up to parse a database and search and replace items in it. A fast, dangerous tool that I released as free open source code. Other people took it and commercialised it into successful products. We didn’t. And with the code since integrated into wp-cli, most devs now use that in preference (myself included!), except in those tricky situations where command line access isn’t possible – mostly on cheap hosts. I think we were right to release the code, but where we failed was in realising the commercial possibilities. And that’s left me a little torn.
So now I’m torn – it’s not easy to set up services in Linux, but once you do, these things just run and run. It’s also not going to be the easiest thing to work with, so I anticipate support costs being quite high. It’s proper server level work. And I certainly don’t feel inclined to build a SaaS that acts as a conduit for people’s backups. It’s just too risky to have a central pool of lots and lots of backups, and people find them lurking on S3 buckets all the time. So I want to put this out to the community. Is this something you’d find useful? Let us know in the comments below. If we did release it, the code would be open source, but access to the latest versions would be restricted.
I went to WordCamp UK 2010 in Manchester… this is my write-up of the event, and its controversies along with my presentations…
I’m just settling in at the office having spent the weekend at WordCamp UK 2010 which was staged in Manchester and is a community event for WordPress users and developers. I gave two presentations, one about WordPress in Big Media, and another about WordPress in the Enterprise. These followed on from presentations given at last year’s WordCamp.
The Craic
I’m going to say now that one of the key elements of a good conference or unconference is the socialising – this is where you meet people, bond with them over beers/food/dancing and form alliances that in the future could prove to be very powerful. You certainly get to make friends and feel like you’re a part of an actual community, and this happens in a way that you’ll never be able to reproduce with online technology. As a consequence it’s no surprise that the awesome Thinking Digital conference has been nicknamed Drinking Digital by some wags.
As ever, Tony Scott excelled himself by getting us access to the famous Factory Manchester (FAC251) which also happens to be across the road from a magnificently geeky pub that sells good beers, has various classic 8 bit and 16 bit computers adorning the walls, and classic arcade games on free play. Awesome.
The Presentations
There was a typically varied range of presentations running across three rooms, along with other folk busy coding up for the WordHack (the fruits of their labours are online). One stream that particularly caught my attention was a sequence of sessions involving John Adams of the Department for International Development. He ran a free-form discussion group on testing strategies, which was followed by an interesting talk on PHP unit-testing by Nikolay Bachiyski of GlotPress fame. This session showed up some of the lack of structure in general testing of WordPress core code, plugins and themes. Although the approaches used were probably fine for a publishing platform, they would struggle to gain ISO approval. In other words, you wouldn’t want to fly on a WordPress powered plane!
Other presentations that I particularly enjoyed were Michael Kimb Jones’s WOW plugins, and Toni Sant’s very underattended Sunday morning slot where he discussed the way WP has helped with a range of Maltese websites.
The Controversy
What’s a WordCamp without at least a little controversy? However, for the attendees of this one, this was a biggie… Jane Wells is Automattic’s Master of Suggestion (seriously, that company has some weird job titles) and she made a suggestion that we shouldn’t have a WordCamp UK, but instead locally organised WordCamps for cities.
There’s a number of issues I have with this:
Everyone in the UK knows that quite quickly WordCamp London would be the big one with all the attention in both media and attendance. It would quickly dominate – in large part helped by the enormous population density of the capital. A WordCamp UK in London would be fine and popular (also considerably more expensive) but that’s all that’s needed.
Many British cities have intense rivalries whilst we all still stand together as a nation – there are folk in Glasgow who would never attend a WordCamp Edinburgh, but would definitely be more interested in a WordCamp Scotland. End result? Cities would have small attendances by and large, and our impressive capacity for indifference for minor events would mean that they’d end up as little more than tiny, cliquey gatherings. Anyone who’s tried to run GeekUps will understand this problem.
A lot of work, energy and our own money has been spent on building up WordCamp UK. Is Jane seriously suggesting we should dump that?
What is Jane’s authority on this? She’s simply an Automattic employee. We chose WordCamp UK and its structure – it’s ours. If someone else wants to run a WordCamp UK in the country they’re perfectly entitled and there’s no real reason why we couldn’t have three or four running each year – that would be a huge success. A highly capitalistic organisation that is just one of thousands of contributors to the project and which plays no part in actually running most WordCamps shouldn’t get so involved.
The UK is also very small – 90% of the population can reach all past WordCamp UKs in less than 3hrs – there is no real problem about accessibility.
None of the UK’s key WordPress community members want to give up WordCamp UK.
Jane admitted only six or seven people had complained to her about the situation, two of whom turned out to be in Ireland – which except for a small part isn’t in the UK at all. She couldn’t confirm whether they were Northern Irish or not, which was actually something of a poor mistake to make in front of 150 or so Brits.
Us Brits are a pretty apathetic bunch at the best of times – actually running a WordCamp in each major city would be surprisingly unlikely to happen – there were only two bids submitted for this year’s event – one in Portsmouth and one in Manchester.
The whole point of the *camp suffix is that it’s all free and easy with no big organisations sticking their oar in. They are inconsistent and joyful. They’re fun. Automattic should keep out.
The WordCamp name is not trademarked, and we’ve been using it in the UK for some time now. It’s ours!
Of course, there are two sides to each argument. Here’s some reasons and benefits to splitting up WordCamps in the UK:
If somebody wished to run a WordCamp for their city they may feel that the UK badge is dominating and there’d be little interest as a consequence if it was called WordCamp Bristol, or WordCamp Salford.
A national event called something like WordConf could happen.
Erm…
Thing is – we can’t necessarily win this battle here in Britain. We don’t control the WordCamp.org website – Matt Mullenweg does (he has the domain registration in his name) so if we fight to keep calling it WordCamp UK there’ll be no ongoing support for the event from Matt and his team if they wish to stop the use of the UK moniker.
Which would mean standing up to them. Do we want to? Are we prepared for a fight on this? What do the likes of Mike Little (co-founder of the WordPress project) and Peter Westwood (a UK based core developer) feel about this?
Interestingly we were told the same thing applies to the likes of WordCamp Ireland which will now face this problem – but I wonder if Matt understands Ireland particularly well (we know Jane doesn’t) and that in that country the dominant WordCamp would quickly become an expensive Dublin event. You may get one doing well in Cork, but Kilkenny, with a population of just 22,000 and which staged this year’s event, probably wouldn’t be able to sustain an annual WordCamp.
So, Jane has to really allow each country to understand its own social constructs and history and let their own communities choose how they do things. One or two may complain, but it’s not possible to please everyone.
And we showed off too…
My company Interconnect IT have released, through our Spectacu.la brand, the following plugins which you may find useful:
I can’t say thank you enough to the people who make WordCamp UK a success for no personal reward. Tony Scott leads it up, with Mike Little, Nick Garner, Chi-chi Ekweozor, Simon Dickson and many many more working hard behind the scenes. Also to Nikolay for letting me play with the fastest 85mm lens I ever saw! Thank you, you’re wonderful people.
Ever needed to migrate a database to a new server or website (especially with WordPress and other PHP applications) and been stuck because when you do a search and replace some of the data seems to get corrupted?
Serialized PHP Arrays Cause Problems
In PHP one of the easiest ways of storing an array in a database is to use the serialize function. Works a treat, but the downside is that you’re not storing data with a cross platform method. In many product development environments this would get you a stern talking to, but in the world of web development where deadlines are tight and betas are the norm, this seems to be overlooked somewhat.
So what we have are tables full of data that can’t be easily edited by hand. For example:
Say you had thousands of records like the one above, and the word ‘multiple’ needs to be changed to ‘happy’. Two bits would change – poll_multiplepolls would now read poll_happypolls and multiple_polls would read happy_polls. In both cases you would have three characters fewer to deal with.
Fine, you may think, but you can only do the change by hand because where it says s:18:"poll_multiplepolls" it now has to say s:15:"poll_happypolls" – see the difference? The s:18 spells out the length of the following string, and it has to be changed to s:15.
I’ll say right now, that that was a pain. For simple arrays I wrote the straightforward PHP Serialization fixer code, which got me out of many a pickle – do the search and replace without worrying, and then run the script. Fixed about 90% of problems.
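As an illustration of the length-fixing idea (not the original PHP fixer script), here is a minimal sketch that re-counts every s:N:"..." string in a serialized value after a careless search and replace. The sample data is made up from the strings discussed above, and the regular expression is an assumption that only holds for simple cases: it will be fooled by any string that itself contains a quote followed by a semicolon.

```python
import re

# Matches one serialized PHP string: s:<declared length>:"<value>";
# Assumption: the value never contains the two characters "; itself.
SERIALIZED_STRING = re.compile(r's:(\d+):"(.*?)";', re.DOTALL)


def fix_serialized_lengths(data):
    """Rewrite each declared string length to match the actual value."""

    def repair(match):
        value = match.group(2)
        # PHP's serialize() counts bytes, not characters.
        byte_length = len(value.encode("utf-8"))
        return 's:%d:"%s";' % (byte_length, value)

    return SERIALIZED_STRING.sub(repair, data)


# Made-up example: 'multiple' was replaced with 'happy' in a dump,
# leaving the declared lengths (18 and 14) stale.
broken = 'a:2:{s:18:"poll_happypolls";s:1:"0";s:14:"happy_polls";s:1:"1";}'
fixed = fix_serialized_lengths(broken)
print(fixed)
```

Array counts such as a:2 are left alone, since a plain search and replace changes string lengths but not the number of elements.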
Multidimensional Array Problem
Sadly those 10% of problems left were a real pain. I needed something more robust. Something more powerful. And finally today it was a Bank Holiday in the UK – that means no phone calls… I could have a quiet day of coding and concentrate on the best solution to this problem.
What I’ve done is to write a database search and replace utility in PHP that scans through an entire database (so use with care!) which is designed for developers to use on database migrations. It’s definitely not what you’d call an end-user tool, though I may sanitize it at some point and turn it into an easy to use WordPress plugin. Thing is – this is dangerous code – sometimes I think it’s better to make it deliberately a bit tricky, don’t you?
It’s not that bad though – if you can manually install WordPress, you can easily configure the database connection settings.
What the code does is look at the database, analyse the tables, columns and keys, and then start reading through it. It will attempt to unserialize any data it finds, and if it succeeds it will modify that data, then reserialize it and pop it back in the database where it found it. If it finds unserialized data it will still carry out the search and replace.
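The decode, replace, re-encode approach can be sketched roughly as follows. This is not the script itself: Python's standard library has no PHP unserializer, so JSON stands in for the serialized format here, and the function names and example URLs are hypothetical. The point is the shape of the logic: walk nested arrays recursively, replace inside string values, and fall back to a plain replace when the cell isn't structured data.

```python
import json


def replace_deep(value, search, replace):
    """Recursively replace `search` in every string value of a nested structure.

    Assumption for this sketch: only values are rewritten, keys are left alone.
    """
    if isinstance(value, str):
        return value.replace(search, replace)
    if isinstance(value, list):
        return [replace_deep(item, search, replace) for item in value]
    if isinstance(value, dict):
        return {key: replace_deep(item, search, replace)
                for key, item in value.items()}
    return value  # numbers, booleans and None pass through untouched


def replace_in_cell(cell, search, replace):
    """Process one database cell: decode if possible, else plain replace."""
    try:
        decoded = json.loads(cell)
    except ValueError:
        decoded = None
    if isinstance(decoded, (dict, list)):
        # Re-encoding recalculates all lengths, so nothing goes stale.
        return json.dumps(replace_deep(decoded, search, replace))
    return cell.replace(search, replace)


old, new = "http://localhost/devsite", "https://example.com"
cell = json.dumps({"home": old, "widgets": [{"link": old + "/about"}], "count": 3})
print(replace_in_cell(cell, old, new))
print(replace_in_cell("Visit " + old + " today", old, new))
```

Because the data is rebuilt from the decoded structure rather than patched as text, multidimensional arrays need no special handling beyond the recursion.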
Use in WordPress
In most WordPress migrations you tend to have the primary problem of changing the domain name entries in content, settings and widgets – you simply need to put in the $search_for string the old domain address (including the http if it’s there) as seen on the database, and the new one into $replace_with. Then put this script onto your server, and run it by visiting it in your browser or inputting the appropriate command line – depending on your server configuration.
Other things you may want to check are for plugins or themes that have made the mistake of storing the full server path to the installation – cFormsII does this, for example. You will need to find out your old and new server paths and use those, in full, for another iteration of this script.
After less than a second of running, you should have a freshly edited database. It may take a little longer on slow or shared hosting, or if you have a very large database, but on my laptop I can manage around 60,000 items of data per second.
I’ve just used the script to migrate, in its entirety, with content, settings, 87 widgets (yes, really!) and hundreds of images to my localhost server. It took moments, and the site is perfectly preserved.
BIG WARNING: I take no responsibility for what this code does to your data. Use it at your own risk. Test it. Be careful. OK? Here in the North we might describe the code as being as “Rough as a badger’s arse.” Never felt a badger’s arse, but I’ll take their word for it.
Serialization of data loaded into an SQL table is a dreadful thing and makes WordPress migrations harder than they should be, but it happens and so we must deal with it. I’ve knocked up a rough and ready bit of code which does its best to resolve the problem.
When you move a WordPress blog from one folder to another, or from one site to another, you normally use the export/import functionality.
This is fine for normal blogs, but say you’ve developed a new website and set it up on your local machine – the URL for the site may be something like http://localhost/devsite and the live URL will be something like https://davidcoveney.com – you won’t want to set up all the theme options, site options, plugin options and so on all over again.
Instead, a theoretically simple approach is to do a database dump, a search and replace for all references to server paths and URLs, and then reimport that data in the new location.
Should work, but it often falls apart.
What happens is that in WordPress, its themes and its plugins, a lot of data is stored using a method known as serialization. Now, in my opinion this breaks all known good practice around data – it’s language specific, it’s not relational even though it often could be, and it’s hard to edit by hand.
One particular problem is that if you change the length of the data in a serialised string you have to change the length declared in the generated string.
That’s very painful when you have hundreds of fields.
So, because I’d found this painful I decided to knock together a quick application to at least reduce the amount of editing I had to do. You just do your search and replace, forget about the serialized string lengths, upload your data to the new database, and run this script.
Warning: I haven’t got it to work for widgets and cForms II yet, but the latter has some export functionality anyway, which takes that particular pain away if you plan ahead. In the meantime, feel free to play with the attached file. You use it at your own risk, of course.
To use it, download the file linked in this post, extract it, open the file, edit the connection settings, tell it the table you want to scan through, the column, and the unique key field. If you somehow manage to have more than one unique key to deal with (you shouldn’t, but then it surprises me what people manage to code up), then you’ll have to modify the code accordingly. Once done, make sure you have a backup of that table, and execute the php – either at the command line or through the browser. License is WTFPL, and if you’d like to improve the code, please do and I’ll host the new version.
BIG WARNING: I take no responsibility for what this code does to your data. Use it at your own risk. Test it. Be careful. OK? Here in the North we might describe the code as being as “Rough as a badger’s arse.” Never felt a badger’s arse, but I’ll take their word for it.