Support

Akeeba Backup for Joomla!

#31976 Akeeba not transferring all parts to S3 but not receiving any errors

Posted in ‘Akeeba Backup for Joomla! 4 & 5’
This is a public ticket

Everybody will be able to see its contents. Do not include usernames, passwords or any other sensitive information.

Environment Information

Joomla! version
n/a
PHP version
n/a
Akeeba Backup version
n/a

Latest post on Sunday, 15 December 2019 17:17 CST

maestroc
I have a large 150 GB site to back up to S3. I have Process Each Part Immediately turned on and the part size set to 1 GB. When I run the backup it appears to complete normally (other than a warning about Process Each Part Immediately being enabled). However, when I go to check the S3 account, only 24 of the 100+ archive parts are there. When I check my local folder, there are over 100 parts remaining to be transferred.

If I click the Transfer Archive button for that backup set, a popup comes up saying that it is continuing the upload, counts up to 137, and says it is complete, but my Amazon bucket says otherwise. It is missing over 100 of the parts.

Log file is attached. Any ideas?

nicholas
Akeeba Staff
Manager
I can tell you what is going on. The "process each part immediately" feature was written before Amazon S3 gave us the ability to upload a file in smaller chunks. The concept of that feature was that at the beginning of each file processing step we'd check whether a backup archive part had been finished and, if so, upload it all at once. Therefore, by the time we came back to put more files into the backup archive, we were working on a new part.

This does not work well with engines that upload the backup archives in chunks and require many calls to the upload method before a part is actually uploaded. That's why in your log file you see some files being added to a part, then another chunk of the previous part being uploaded, and so on and so forth. This is also why you get the warning about using this option.
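
To illustrate the clash, here is a minimal, self-contained sketch of the interleaving; it is not the actual Akeeba Backup engine code, and every name and number in it is hypothetical:

```php
<?php
// Hypothetical illustration of why "process each part immediately" clashes
// with chunked uploads. This is NOT Akeeba Backup's engine code.

$chunksPerPart  = 5;  // a chunked post-processor needs several calls per part
$uploadedChunks = 0;  // upload progress of the previously finished part

// Each backup step both packs some files AND pushes at most one chunk.
for ($step = 1; $step <= 8; $step++) {
    echo "step $step: packed more files into the current part\n";

    if ($uploadedChunks < $chunksPerPart) {
        $uploadedChunks++;
        echo "step $step: uploaded chunk $uploadedChunks/$chunksPerPart of the PREVIOUS part\n";
    }
}

// Packing and uploading interleave across many steps, which is the same
// back-and-forth pattern visible in the log file. The original feature
// assumed one finished part could be uploaded in a single call before
// packing resumed.
```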

I am not entirely convinced that this can be fixed without having side-effects. It's something I am investigating for the upcoming Akeeba Backup 7. For now, my recommendation is to disable the process each part immediately feature.

Based on the log file, I see that the bulk of your data is sound files. You can create two backup profiles: one full site backup which excludes the sound files, and a second, incremental, files-only backup profile which includes only the sound files. This would be a far more efficient method to back up your site.

Nicholas K. Dionysopoulos

Lead Developer and Director

πŸ‡¬πŸ‡·Greek: native πŸ‡¬πŸ‡§English: excellent πŸ‡«πŸ‡·French: basic β€’ πŸ• My time zone is Europe / Athens
Please keep in mind my timezone and cultural differences when reading my replies. Thank you!

maestroc
The main reason I had enabled the Process Each Part Immediately option was storage space. We have a 500 GB drive with 150 GB of data. When I tried it the first time without Process Each Part Immediately turned on, it maxed out the disk space and I had to SSH in and delete files to be able to access the site again. Is there a way around this part of the problem?

nicholas
Akeeba Staff
Manager
I have found a way to fix it; I am currently writing the code for it. It will take me a day or two, bearing in mind that it's the weekend and I only get to write code when our child is asleep.

The catch is that the fix can only be applied in the new, refactored version of the backup engine I am preparing for Akeeba Backup 7. It cannot be applied to the backup engine of Akeeba Backup 6.

Are you willing to use a beta version of Akeeba Backup 7? It should actually be very stable; I've been using it for a month. If not, you'll have to wait for the stable 7.0.0 release. My plan is to release a public beta next week and another one in the beginning of December. Barring any surprise issues, a stable release should follow in the second week of January, when everyone is back from holidays.


maestroc
Sure, I'll try it. For the record, I went ahead and switched things around as you suggested: I made the normal full site backup, then an incremental backup for the songs folder. The normal one backed up fine to S3, but it still won't send the large incremental one, even with Process Each Part Immediately turned off. I've attached the log file for this latest incremental attempt in case you need it.

maestroc
Here is the log.

nicholas
Akeeba Staff
Manager
To be clear, I already said that chunked uploads and "Upload Each Part Immediately" will not work together correctly in Akeeba Backup versions before 7.0.0. As I said, it is an architectural issue that could only be fixed after several rounds of refactoring which eventually culminated in version 7 which, at the time of this writing, is not yet released.

The type of backup is irrelevant to this issue happening with the currently released versions of Akeeba Backup. My suggestion for a more efficient way to back up your site was made on the premise that "Upload Each Part Immediately" would be disabled.

However, this is a moot point if you're willing to test a pre-release version of Akeeba Backup 7. Please give version 7.0.0.a1 a try. I have already tested it with a big, multi-part backup being uploaded to Amazon S3 in multiple chunks. Version 6.6.1 failed to work properly, as you reported; version 7.0.0.a1 worked. If possible, please also test restoring your backups. I have not seen any issues restoring them, but I'd appreciate feedback from more people than yours truly and my associates :)


maestroc
I downloaded and installed the new version and gave it a whirl. The first time I didn't time it right, and the system only had enough time to transfer 30 of the 126 archive files before I had to close my laptop and leave. I deleted the leftover archives and restarted it when I got home, so hopefully it will complete normally.

Some initial thoughts to share:
1) I like your change to the % completed indicator. It seems to be much more accurate from what I've seen so far.
2) I can't figure out how to enable front-end backups. The switch isn't in the options panel any more, but the remote backup secret word still is. Did it get moved somewhere? I use Akeeba along with Watchful.li, so I was hoping to connect the two for backup purposes. I realize this is an alpha, so it might not be in there yet, but I thought I'd mention it.
3) Wasn't there an option somewhere to finish uploading files to remote storage? I could be totally wrong on this, but I thought I remembered the system noticing that an upload hadn't completed and offering to finish it.

nicholas
Akeeba Staff
Manager
1. I have not touched the percentage calculation at all :) There is not much you can do with it, really. You don't know how much you have to back up until you have completed the backup. The percentage is an approximation based on the number of tables in the database and folders in the root of your site.
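
For what it's worth, the approximation works roughly like the following sketch; the actual implementation may differ, and the function and counter names here are made up for illustration:

```php
<?php
// Rough sketch of the progress approximation described above; the actual
// Akeeba Backup implementation may differ. All names are hypothetical.

function approximateProgress(int $tablesDone, int $tablesTotal,
                             int $foldersDone, int $foldersTotal): float
{
    // Work units: database tables plus top-level folders of the site.
    $total = $tablesTotal + $foldersTotal;
    $done  = $tablesDone + $foldersDone;

    // Assumes every table and folder takes roughly the same time to process,
    // which is why the percentage can only ever be an approximation.
    return $total > 0 ? min(100.0, 100.0 * $done / $total) : 0.0;
}

// Example: all 30 tables done, 7 of 12 root folders done => about 88%.
printf("%.1f%%\n", approximateProgress(30, 30, 7, 12));
```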

2. Go to the Scheduling Information page. It tells you what to do. In version 7 we have moved the front-end legacy backup feature and the JSON backup API into separate plugins, available only in the Professional release. They are disabled by default. I believe the tooltip of the frontend backup secret word also tells you the same. You need to enable the JSON API plugin to connect your site to Watchful.

3. It is still there: Manage Backups page, right-hand column. The Manage Backups page has not been touched in this release.


maestroc
I tried the backup again overnight. When I went to bed it was at the point where it said it was checking archive integrity (I had turned off the Process Each Part Immediately setting). This morning it appears something went wrong.

First, the log file for this job is over 120 GB in size. If you really want it I can try to move it to a public folder, but I would prefer to send the link privately instead of in a public ticket.

As for the archive itself, it looks like it made the right number of archive parts, but some of them are missing from the local folder. Parts 2-6 are not in the local folder and not on S3 either. The only files from this run that made it to S3 are the .jpa (496 MB) and the .j01 (27 bytes).

When I clicked the transfer button in the Manage Backups page it gave the following error:

Upload of your archive failed.
Cannot open /home/admin/domains/.com/public_html/administrator/components/com_akeeba/backup/site-.com-20191111-224003utc.j02 for reading

nicholas
Akeeba Staff
Manager
Disable the archive integrity checks. They will always fail when you use Upload Each Part Immediately. This is documented. Moreover, this is what is triggering your problem with missing parts.

I will make a change in Akeeba Backup to automatically disable the archive integrity checks when Upload Each Part Immediately is enabled.


nicholas
Akeeba Staff
Manager
Wait, something is not right in what you are saying. We already disable the backup archive integrity check when Upload Each Part Immediately is enabled, exactly because it would cause the post-processing to fail. This means that your description of the issue does not match what the code is actually doing.

I think that what you perceived as testing the backup was, in fact, the last step that completed successfully. The next step after that is uploading the rest of the backup archive parts which could not be uploaded during the backup process, i.e. the very first (.j01) and very last (.jpa) parts. So your problem has nothing to do with the archive integrity check and everything to do with the code that uploads the parts during the backup process.

At this point I am not sure whether it's an issue with our code or something went sideways with Amazon S3. I need the log file to understand what is really happening here. Please ZIP and attach the log file from the latest failed backup attempt.

Thank you!


maestroc
My apologies. I was tired/confused/frustrated when I wrote that other message and misspoke about some details. I think it was me that messed things up on that backup attempt. Please disregard it; honestly, at this point I don't know what happened or how I had it set. It might have been partially due to the fact that I had apparently set the computer to sleep after 5 hours instead of never. I had to delete those backup files and the mega log file to free up space, so unfortunately I can't send the log. Based on your previous message, though, things were working as designed; the failure in that case appears to have been mostly my error.

Last night went a lot better, though. I gave it another try, making sure that the computer was set to never sleep, with Akeeba's upload immediately setting turned off, and using the Files Only backup type instead of the incremental one. This time when I checked in the morning it had properly uploaded the whole batch, and the manager reported a 136 GB upload to S3. All of the files were on S3 and also properly deleted from the local drive, so all good there.

Tonight I am going to try it again, but will change it to do the incremental backup instead and see if that works as well.

Question: I know JPA is the preferred format, but since this is such a huge static folder of audio files, will it make any difference if I set it to use the ZIP option instead of JPA? I'm just thinking about it for down the line, in case I ever need to restore. I know you said in one documentation page that to use incremental backups all of the parts must be present for the restore to work. I was wondering, if I use the ZIP format, would I be able to restore things even if one or two of the files were corrupted or lost?

nicholas
Akeeba Staff
Manager
apparently I had set the computer to sleep after 5 hours instead of never


That was what I thought was the most likely reason since you said you left the computer unattended. Based on the rest of your reply this seems to have indeed been the case.

will it make any difference if I set it to use the ZIP option instead of JPA?


No, it does not make much of a difference. The ZIP format does require storing file checksums. Most PHP versions have a very fast function called hash_file which allows us to calculate and store the correct checksums. If that function is disabled by your host, we would have to read the entire file into memory to calculate its checksum. That can't happen (we'd run out of PHP memory), so instead we store an invalid checksum, for files over 10MB and only on these problematic hosts. If this happens, when you are extracting the ZIP file with something other than Kickstart you might get a notice that the file is corrupt. If you can tell your ZIP software to ignore these errors you will have no problem.
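
As a rough sketch of that fallback logic (this is illustrative, not Akeeba Backup's actual code; ZIP stores a CRC-32 checksum per file):

```php
<?php
// Illustrative sketch of the checksum fallback described above; this is not
// Akeeba Backup's actual code. ZIP archives store a CRC-32 per file.

const BIG_FILE_BYTES = 10 * 1024 * 1024; // the 10MB threshold mentioned above

function zipChecksum(string $path): int
{
    if (function_exists('hash_file')) {
        // Fast path: hash_file() streams the file from disk, so memory use
        // stays constant no matter how big the file is.
        return (int) hexdec(hash_file('crc32b', $path));
    }

    if (filesize($path) <= BIG_FILE_BYTES) {
        // Small file: reading it into memory to checksum it is acceptable.
        return crc32(file_get_contents($path));
    }

    // Big file on a host that disabled hash_file(): store a dummy checksum
    // rather than exhaust PHP memory. Strict ZIP extractors will flag the
    // file as corrupt; Kickstart ignores the mismatch.
    return 0;
}
```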

This used to be a massive issue, though not so much anymore. It is, however, why we still use JPA instead of ZIP as the preferred unencrypted backup archive format.


nicholas
Akeeba Staff
Manager
Some more notes about what you wrote towards the end of your previous reply.

I know you said in one documentation page that to use incremental all of the parts must be present for the restore to work. Was wondering if I use the zip format if I would be able to restore things even if for some reason one or two of the files were corrupted or lost


One has nothing to do with the other.

If you use incremental backups, each subsequent backup only contains files which were created or modified after the time of the last backup. Therefore you need to restore all backups (NOT parts; I am talking about entire backups) to get your entire site back.
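
As a toy illustration of why the whole chain matters (hypothetical file names; this is not Akeeba code):

```php
<?php
// Hypothetical illustration: restoring a full backup plus incrementals.
// Later archives overwrite earlier copies of the same file, which is why
// every backup in the chain must be restored, in chronological order.

$full         = ['a.mp3' => 'v1', 'b.mp3' => 'v1'];
$incrementals = [
    ['b.mp3' => 'v2'], // only files changed since the full backup
    ['c.mp3' => 'v1'], // only files changed since the previous run
];

$site = $full;
foreach ($incrementals as $inc) {
    $site = array_merge($site, $inc); // later versions win
}

print_r($site); // a.mp3 => v1, b.mp3 => v2, c.mp3 => v1; skip a step and files go missing
```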

Regarding split archives, it does not matter whether the format is ZIP, JPA or JPS: a multipart archive of any format requires all of its parts in order to be extracted. Think of a multipart archive as a novel consisting of many chapters, let's say 50. You can't pick up the book and start reading from chapter 5; you'll have no idea what is going on. Likewise, if chapters 5 to 10 are missing and you try to read the book, you'll have no idea what is going on: after chapter 4 you'd have to jump to 11, and now you've literally lost the plot. You can't read the chapters you have at hand and somehow magically reconstruct the content of the missing ones. It doesn't matter if the book is written in American English or British English, either; you still can't read the book without all the chapters.


maestroc
2. Go to the Scheduling Information page. It tells you what to do. In version 7 we have moved the front-end legacy backup feature and the JSON backup API into separate plugins, available only in the Professional release. They are disabled by default. I believe the tooltip of the frontend backup secret word also tells you the same. You need to enable the JSON API plugin to connect your site to Watchful.


I checked, and both the legacy backup and the Akeeba JSON plugins are published, and my secret word has been set. I copied that secret word over to Watchful.li, but it keeps giving me: ERROR: Access Denied: Please enable Frontend Backup in your Akeeba backup settings. Am I missing something else here?

nicholas
Akeeba Staff
Manager
Yes, there is a bug in the alpha I told you to use. Please use the developer's release here: https://www.akeebabackup.com/download/developer-releases/akeebapro/rev7186841c.html


System Task
system
This ticket has been automatically closed. All tickets which have been inactive for a long time are automatically closed. If you believe that this ticket was closed in error, please contact us.

Support Information

Working hours: We are open Monday to Friday, 9am to 7pm Cyprus timezone (EET / EEST). Support is provided by the same developers writing the software, all of whom live in Europe. You can still file tickets outside of our working hours, but we cannot respond to them until we're back at the office.

Support policy: We would like to kindly inform you that when using our support you have already agreed to the Support Policy which is part of our Terms of Service. Thank you for your understanding and for helping us help you!