path: root/mirrors/utils.py
Age | Commit message | Author | Files | Lines
2014-10-21 | Fix 500 when no URLs have been checked [release_2014-10-21] | Dan McGee | 1 | -2/+2
Signed-off-by: Dan McGee <dan@archlinux.org>
2014-10-21 | Move caching of function data back to get_mirror_statuses | Dan McGee | 1 | -4/+3
We've moved this around a few times, including changing the parameters to ensure they are stable (commit bdfa22500f4). However, the bulk of the work takes place in the mashing up of the data, so cache the full result rather than just the result of a single query. Signed-off-by: Dan McGee <dan@archlinux.org>
2014-10-21 | Simplify/clean-up finding of download mirror | Dan McGee | 1 | -9/+7
Signed-off-by: Dan McGee <dan@archlinux.org>
2014-10-21 | Reduce complexity of status data URL query | Dan McGee | 1 | -9/+5
Get rid of all the junk trying to only return URLs that have been checked in the last 24 hours; it just isn't worth it. Instead, do that screening only in the views that need it, namely the HTML status page. Signed-off-by: Dan McGee <dan@archlinux.org>
2014-10-21 | Small performance tweaks to mirror status JSON encoding | Dan McGee | 1 | -16/+14
Do a few things to speed up the encoding of the JSON, including better usage of list comprehensions, less dynamic setattr() usage, and removal of the queryset specialization since we can easily do it outside of the encoder. Signed-off-by: Dan McGee <dan@archlinux.org>
2014-02-22 | Upgrade django-countries to 2.0 | Dan McGee | 1 | -1/+1
Signed-off-by: Dan McGee <dan@archlinux.org>
2013-12-23 | Set all attributes to default values on status URL fetch | Dan McGee | 1 | -0/+2
We were missing two duration-related attributes here, causing some 500 errors to happen if we had cached status_data around that didn't agree with our current list of checked mirrors. Don't blow up on the JSON data fetch by ensuring we provide a value, even if it is out of date. Signed-off-by: Dan McGee <dan@archlinux.org>
2013-12-14 | Use stable parameters for cacheable function | Dan McGee | 1 | -2/+3
It doesn't do much good to mark a function as cacheable if we call it every single time with different arguments due to using the current date and time. Fix it by passing the offset in instead. Signed-off-by: Dan McGee <dan@archlinux.org>
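The effect described above can be sketched with a toy memoizing decorator. This is only an illustration: `cache_by_args`, `statuses_since`, and `statuses_within` are hypothetical names, not the project's actual `cache_function` decorator or its functions.

```python
from datetime import datetime, timedelta, timezone

_cache = {}
calls = []  # records every real (uncached) invocation


def cache_by_args(func):
    """Hypothetical stand-in for a caching decorator: memoize on the arguments."""
    def wrapper(*args):
        key = (func.__name__, args)
        if key not in _cache:
            _cache[key] = func(*args)
        return _cache[key]
    return wrapper


@cache_by_args
def statuses_since(cutoff_time):
    # Unstable key: callers compute a fresh datetime for every call,
    # so the cache key almost never repeats.
    calls.append("since")
    return "status data newer than %s" % cutoff_time


@cache_by_args
def statuses_within(cutoff):
    # Stable key: the offset is identical on every call, and the
    # current time is computed inside the cached function instead.
    calls.append("within")
    cutoff_time = datetime.now(timezone.utc) - cutoff
    return "status data newer than %s" % cutoff_time


day = timedelta(hours=24)
for _ in range(3):
    statuses_since(datetime.now(timezone.utc) - day)
    statuses_within(day)
```

After the loop, `statuses_within` has run its body only once, while `statuses_since` typically runs on every call because each argument differs.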
2013-12-14 | Show all mirror status data to authorized users | Dan McGee | 1 | -9/+11
Regardless of whether the mirror URL is active or not, we often have data we can show the end user, especially if mirror admins care to see the data we've been gathering. Signed-off-by: Dan McGee <dan@archlinux.org>
2013-07-13 | Fix completion percentage calculation in mirror status | Dan McGee | 1 | -2/+2
We sometimes record a duration even on a failed fetch attempt, such as if we get an HTTP 404. However, we never record a last_sync value on a failed fetch. Use this field instead to sum up the total number of successful checks. Signed-off-by: Dan McGee <dan@archlinux.org>
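A minimal sketch of why the choice of field matters, using made-up check records. The dictionary layout and the `completion_pct` helper are assumptions for illustration, not the project's actual query.

```python
# Hypothetical check log entries for one mirror URL.
checks = [
    {"duration": 0.8, "last_sync": "2013-07-13 10:00"},
    {"duration": 0.3, "last_sync": None},   # e.g. HTTP 404: timed, but nothing synced
    {"duration": None, "last_sync": None},  # e.g. connection failure
    {"duration": 0.9, "last_sync": "2013-07-13 11:00"},
]


def completion_pct(entries, field):
    """Fraction of checks counted as successful, judged by a non-null field."""
    successes = sum(1 for entry in entries if entry[field] is not None)
    return successes / len(entries)


by_duration = completion_pct(checks, "duration")    # overcounts the 404
by_last_sync = completion_pct(checks, "last_sync")  # counts only real syncs
```

With this data, counting by `duration` reports 0.75 while counting by `last_sync` reports 0.5, since the failed fetch with a recorded duration is no longer treated as a success.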
2013-06-20 | Re-enable caching for somewhat expensive mirror status query | Dan McGee | 1 | -1/+1
This should be a small enough chunk of data that it isn't super expensive to put into and pull out of memcached. Signed-off-by: Dan McGee <dan@archlinux.org>
2013-05-31 | Honor mirror URL active attribute in several places | Dan McGee | 1 | -3/+7
Signed-off-by: Dan McGee <dan@archlinux.org>
2013-04-20 | Fix some None issues with sqlite3 and mirror status | Dan McGee | 1 | -2/+4
If certain attributes came back from the database as NULL, we had issues parsing them. Pass None/NULL straight through rather than trying to type-convert. Signed-off-by: Dan McGee <dan@archlinux.org>
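The pass-through idea can be sketched as follows. The helper name and the timestamp format are assumptions for illustration; the actual conversion code in the project may differ.

```python
from datetime import datetime


def parse_db_datetime(value):
    """Convert a text timestamp from the database, passing NULL straight through."""
    if value is None:
        # NULL from the database: do not attempt a type conversion.
        return None
    # Assumed storage format, chosen only for this example.
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")


parsed = parse_db_datetime("2013-04-20 12:30:00")
missing = parse_db_datetime(None)
```

Without the `None` guard, `datetime.strptime(None, ...)` raises a `TypeError`.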
2013-04-16 | Various minor code cleanups and fixes [release_2013-04-16] | Dan McGee | 1 | -3/+3
Most of these were suggested by PyCharm, and include everything from little syntax issues and other bad smells to dead or bad code. Signed-off-by: Dan McGee <dan@archlinux.org>
2013-04-14 | Remove cache_function decorator from a few spots | Dan McGee | 1 | -2/+0
The benefit of these storage operations might be outweighed by the cost, especially given how infrequently these functions are called. Signed-off-by: Dan McGee <dan@archlinux.org>
2013-04-14 | Reduce mirror status query madness | Dan McGee | 1 | -55/+91
Move completely to custom SQL for this logic. The Django ORM just doesn't play nice with the kind of query we are looking to do, so it is easier to do using raw SQL. The biggest pain factor here is in supporting sqlite as it doesn't have nearly the capabilities in handling datetime types directly in the database, as well as having some different type conversion necessities. Signed-off-by: Dan McGee <dan@archlinux.org>
2013-04-13 | Support only a single mirror ID in error/status retrieval | Dan McGee | 1 | -9/+9
This simplifies things and makes injecting this single mirror ID into custom SQL a whole lot easier. Signed-off-by: Dan McGee <dan@archlinux.org>
2013-04-13 | Calculate average URL delay in the database | Dan McGee | 1 | -19/+30
Rather than doing this in the Python code and needing 12,000+ rows returned from the database, we can do it in the database and get fewer than 300 rows back. If I recall correctly, the reason this was not done originally was due to our usage of MySQL and some really bad date math/overflow stuff it did when the interval between last_sync and check_time was greater than about a week. Luckily, we have switched to using a more sane database. Signed-off-by: Dan McGee <dan@archlinux.org>
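As a rough sketch of pushing the averaging into the database, the example below uses an in-memory SQLite table. The table name, the column layout, and the use of integer epoch seconds are all assumptions made to keep the example self-contained; the project's real schema and SQL are not shown in this log.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mirror_log (url_id INTEGER, check_time INTEGER, last_sync INTEGER)"
)
conn.executemany(
    "INSERT INTO mirror_log VALUES (?, ?, ?)",
    [
        (1, 1000, 400),   # delay of 600 seconds
        (1, 2000, 1600),  # delay of 400 seconds
        (1, 3000, None),  # failed check: NULL delay, ignored by AVG
        (2, 1000, 900),   # delay of 100 seconds
    ],
)

# One row per URL comes back, instead of one row per log entry.
avg_delays = conn.execute(
    "SELECT url_id, AVG(check_time - last_sync) "
    "FROM mirror_log GROUP BY url_id ORDER BY url_id"
).fetchall()
conn.close()
```

The aggregate returns a single averaged delay per `url_id`, so the Python side only has to handle as many rows as there are URLs.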
2013-01-14 | Drop country column from mirror table | Dan McGee | 1 | -4/+3
We now always look for this information at the URL level, not the mirror level. This simplifies quite a bit of code in and around the mirror views. Signed-off-by: Dan McGee <dan@archlinux.org>
2013-01-12 | Round two of mirror status query improvements | Dan McGee | 1 | -9/+10
This seems to generate much more performant queries at the database level than what we were previously doing, and also doesn't show duplicate rows. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-11-21 | Fix mirror URL duplication in status view | Dan McGee | 1 | -1/+2
We need to ensure we don't duplicate URLs in the status view, so add back the distinct() call on the queryset that was inadvertently dropped in commit a2cfa7edbb. Unfortunately, this negates a lot of the performance gains we had, so it looks like a nested subquery might be more efficient. It is disappointing that the planner can't do this for us. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-11-16 | Optimize mirror status data fetching | Dan McGee | 1 | -8/+16
Now that we have as many mirror URLs as we do, we can do a better job fetching and aggregating this data. The prior method resulted in a rather unwieldy query being pushed down to the database with a horrendously long GROUP BY clause. Instead of trying to group by everything at once so we can retrieve mirror URL info at the same time, separate the two queries: one for getting URL performance data and one for the qualitative data. The impetus behind fixing this is the PostgreSQL slow query log in production; this currently shows up the most of any queries we run in the system. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-11-11 | Mirror graph tweaking after usage with real data | Dan McGee | 1 | -1/+2
* Clamp y-axis minimum to 0.
* Don't plot `is_success == false` values.
* Ensure URLs are sorted predictably.
Signed-off-by: Dan McGee <dan@archlinux.org>
2012-11-10 | Allow filtering retrieved mirror statuses by mirror_id | Dan McGee | 1 | -4/+15
When we don't need them all, no need to fetch them all. Let the database do the work for us, hopefully. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-10-10 | Make mirror log time query a bit more efficient | Dan McGee | 1 | -4/+6
We don't need the full mirror log objects; we just need a very small subset of values from them here to do the required math and object building. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-07-24 | Remove custom utc_now() function, use django.utils.timezone.now() | Dan McGee | 1 | -8/+9
This was around from the time when we handled timezones sanely and Django did not; now that we are on 1.4 we no longer need our own code to handle this. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-07-08 | Correctly reassign queryset with added annotation in mirror status [release_2012-07-08] | Dan McGee | 1 | -1/+1
This was a dumb oversight on my part in commit 0f3c894e7a0. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-07-08 | Don't include StdDev on sqlite3 mirror status query | Dan McGee | 1 | -3/+9
Because this function isn't shipped by default, it makes more sense to just omit it completely from the query we do to build the tables on this page when in development. Substitute 0.0 for the value so the rest of the calculations and display work as expected. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-05-13 | Add ability to restrict status report to single tier [release_2012-05-13] | Dan McGee | 1 | -1/+1
This should make it easier to catch errors in our Tier 1 mirrors. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-05-13 | Don't limit protocols returned by mirror status function | Dan McGee | 1 | -2/+0
If results weren't available for certain URLs, they won't show up anyway in this list, and if we start to check rsync URLs, then we want their values to come back in this status list. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-04-25 | Ensure sorted order of mirrors in status page matches with JS [release_2012-04-25] | Dan McGee | 1 | -3/+2
We had one sorting order in the backend, and another once the JS sorting routine kicked in. Match them so we aren't doing more on the client-side on initial display than we have to. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-04-25 | Finish django countries implementation | Dan McGee | 1 | -8/+9
* Add a migration to drop the old countries field.
* Update all templates/views/utility methods to point at the new country field and dereference it as necessary.
* Add the flags images to a few views where it makes sense.
* Cleanup the download page layout quite a bit.
* Bump the mirror status JSON version to 3; add country_code attribute.
Signed-off-by: Dan McGee <dan@archlinux.org>
2012-04-25 | Rename mirror country fields to country_old in prep for normalization | Dan McGee | 1 | -4/+4
We're going to move to using ISO 2 character codes via django countries, so start by moving the old data out of the way first. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-04-08 | Don't blow up when no mirror status data is available | Dan McGee | 1 | -1/+1
The check here was wrong before; in the case of no mirror log entries the returned value will not be empty, but will contain two empty values. Check the values instead to see if we have valid data available. Signed-off-by: Dan McGee <dan@archlinux.org>
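The pitfall can be shown with a small stand-in for an aggregate result. The function and key names here are hypothetical; they only mimic the shape of a result that always contains two keys, whose values are `None` when there are no log entries.

```python
def check_time_range(check_times):
    """Mimic an aggregate result: both keys always exist, values may be None."""
    return {
        "check_time__min": min(check_times, default=None),
        "check_time__max": max(check_times, default=None),
    }


def has_status_data(times):
    # Inspect the values; the dictionary itself is never empty.
    return (
        times["check_time__min"] is not None
        and times["check_time__max"] is not None
    )


empty = check_time_range([])
wrong_check = bool(empty)            # truthy even with no data
right_check = has_status_data(empty)  # correctly reports no data
```

A truthiness test on the result passes even when no checks exist, which is why the values have to be examined instead.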
2012-04-07 | Choose an up-to-date mirror for download URLs | Dan McGee | 1 | -0/+34
Given that we collect a lot of mirror status data, we can utilize it to ensure the download link on the website actually works and newly-added packages have actually been mirrored out. Add a method that attempts to use the mirror status data to determine a mirror we should redirect our download requests to. This can change on a regular basis, and falls back to the old method if no mirror status data is available. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-03-23 | Make all datetime objects fully timezone aware | Dan McGee | 1 | -6/+7
This is most of the transition to Django 1.4 `USE_TZ = True`. We need to ensure we don't mix aware and non-aware datetime objects when dealing with datetimes in the code. Add a utc_now() helper method that we can use most places, and ensure there is always a timezone attached when necessary. Signed-off-by: Dan McGee <dan@archlinux.org>
2012-01-05 | Adjust page and content caching lengths and decorators | Dan McGee | 1 | -2/+2
Remove never_cache from many places now that we don't actually need it since we aren't caching by default. Adjust our cache_function decorator times to be shorter values, and also randomize them a bit to make cache invalidations not all line up. Signed-off-by: Dan McGee <dan@archlinux.org>
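Randomizing expiry times can be sketched like this. The helper name and the specific numbers are assumptions for illustration; the commit does not state the actual timeouts or how much they are varied.

```python
import random


def jittered_timeout(base_seconds, spread_seconds, rng=random):
    """Return a cache timeout near base_seconds, varied by up to spread_seconds."""
    return base_seconds + rng.randint(-spread_seconds, spread_seconds)


# Entries cached at the same moment now expire at slightly different times,
# so they are not all recomputed in the same instant.
timeouts = [jittered_timeout(300, 30) for _ in range(5)]
```

Each value falls between 270 and 330 seconds, which spreads out the recomputation of entries that were cached together.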
2011-12-11 | Switch back to using standard deviation in mirror check page | Dan McGee | 1 | -2/+1
This got checked in by default, whoops. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-11-10 | Add package signoffs JSON view | Dan McGee | 1 | -1/+2
This allows access to the same data (and even a bit more) from the signoffs overview page in a machine-friendly way. Signed-off-by: Dan McGee <dan@archlinux.org>
2011-04-18 | mirrors: pylint discovered cleanups | Dan McGee | 1 | -19/+23
Signed-off-by: Dan McGee <dan@archlinux.org>
2011-04-12 | Add optional country override for individual mirror URLs | Dan McGee | 1 | -3/+6
This allows a named top-level mirror to have geographically distributed URLs, e.g. kernel.org and the geo-DNS setup. Signed-off-by: Dan McGee <dan@archlinux.org>
2010-12-12 | Use check count for this URL, not max of all mirrors | Dan McGee | 1 | -1/+1
Prevents a recently enabled mirror from getting unfairly represented as far as completion percentage goes. Signed-off-by: Dan McGee <dan@archlinux.org>
2010-10-07 | Factor check completion pct into mirror score [release_2010-10-07] | Dan McGee | 1 | -2/+7
Use it as the divisor in our slightly longer equation. Signed-off-by: Dan McGee <dan@archlinux.org>
2010-10-01 | Fix an off by one error in math for check interval | Dan McGee | 1 | -1/+5
Because we are averaging the interval and not the value, we need to subtract one from the total we are dividing by. Whoops. Signed-off-by: Dan McGee <dan@archlinux.org>
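The arithmetic behind this fix: n check times are separated by n - 1 intervals, so the span between the first and last check has to be divided by n - 1. A minimal sketch, with a hypothetical helper name:

```python
from datetime import datetime, timedelta


def average_interval(check_times):
    """Average gap between consecutive checks, or None with fewer than two checks."""
    if len(check_times) < 2:
        return None
    span = max(check_times) - min(check_times)
    # n timestamps bound n - 1 intervals, hence the subtraction.
    return span / (len(check_times) - 1)


start = datetime(2010, 10, 1, 12, 0)
times = [start + timedelta(minutes=20 * i) for i in range(4)]  # 12:00 to 13:00

correct = average_interval(times)
off_by_one = (max(times) - min(times)) / len(times)  # the buggy divisor
```

For four checks spaced 20 minutes apart, dividing by the number of checks gives 15 minutes, while dividing by the number of intervals gives the expected 20 minutes.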
2010-09-30 | Use new is_download field | Dan McGee | 1 | -1/+1
Signed-off-by: Dan McGee <dan@archlinux.org>
2010-09-30 | Mirror status improvements | Dan McGee | 1 | -12/+17
* Fix sorting issues. '', 'unknown', and '∞' should now always sort after anything else in the list.
* Add a completion percentage column; this will tell you at a glance if a mirror is sometimes unresponsive. This should probably be incorporated into the mirror score.
* Make a few more things dynamic in the template, like the time back the page reflects.
* Add some additional template tags for formatting things.
Signed-off-by: Dan McGee <dan@archlinux.org>
2010-09-24 | Give more information about mirror check runs and frequency | Dan McGee | 1 | -5/+24
Show how many times the check has run in the last 24 hours, as well as the average interval between checks. Signed-off-by: Dan McGee <dan@archlinux.org>
2010-09-24 | Mirror status query refinements | Dan McGee | 1 | -4/+4
Only show errors for active and public mirrors, and collapse two filter calls into just one for our normal status query. Signed-off-by: Dan McGee <dan@archlinux.org>
2010-09-22 | Switch mirror status delay display to average delay [release_2010-09-22] | Dan McGee | 1 | -6/+19
This takes a bit more work to compute, but since we cache all of this anyway it isn't too big of a deal. Using average delay instead of last delay will be a bit more fair on mirrors that have odd syncing schedules, as well as exposing those that only sync once a day. Also fix an issue that will arise with cutoff_time being calculated once, and adjust mirror score to treat hours delay as a float rather than an integer. Signed-off-by: Dan McGee <dan@archlinux.org>
2010-09-21 | Allow caching of mirror status info | Dan McGee | 1 | -0/+45
Signed-off-by: Dan McGee <dan@archlinux.org>