User talk:Kghbln/Archive 01

Hello
Thanks for signing up and registering a website. I'm still actively building out WikiApiary but you will shortly see your statistics start to show up. Have fun and hope you find the Apiary useful! Thingles (talk) 20:13, December 28, 2012 (CST)


 * Heiya Jamie, I just saw that you added this site to the Community Wiki and thought I'd give it a try. A great idea and I am looking forward to how it develops. Cheers --[[kgh]] (talk) 20:15, December 28, 2012 (CST)

Website Summaries
I saw you added summary/description text to some of the wikis you added. I've been thinking I should add that as a property for the website form. Seeing you add it makes me think that I definitely should! :-) BTW, thank you for adding more wikis. That is awesome! Thingles (talk) 07:41, December 29, 2012 (CST)


 * I really like this wiki. The idea of providing a brief description is from the SMW community wiki. Having an extra field for this, including a related property, is a good idea. In return you could remove the free text field. I am not so sure about the tags I should use and this is why I have not done so so far. Probably autocompletion on property values is the way to go here. Cheers --[[kgh]] (talk) 07:52, December 29, 2012 (CST)


 * I've added the field to the form, added autocompletion and removed the free text area as you suggested (diff). Something is amiss with my autocompletion though. I've noticed this on another wiki of mine, need to debug what is happening there. Thingles (talk) 08:01, December 29, 2012 (CST)


 * Great! Looks good to me - approved! :) Why don't you just use "values from property" instead of "remote autocompletion"? Cheers --[[kgh]] (talk) 08:07, December 29, 2012 (CST)

Indexes on Wikis
I'm planning on adding a number of calculated indexes to WikiApiary shortly. Calculations using the existing statistics that would be interesting. One example I've been considering is an activity index that would be (edits in last 28 days/active users). Another would be admin index (active users/admins). Any suggestions you have on indexes would be great. Thingles (talk) 08:07, December 29, 2012 (CST)
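
For reference, a minimal Python sketch of the two indexes described above. The function names and the choice to return `None` when the denominator is zero are assumptions for illustration; on WikiApiary itself these values are calculated in wiki templates, not in Python.

```python
from typing import Optional


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def activity_index(edits_last_28_days: int, active_users: int) -> Optional[float]:
    """Activity index: edits in the last 28 days per active user."""
    return ratio(edits_last_28_days, active_users)


def admin_index(active_users: int, admins: int) -> Optional[float]:
    """Admin index: active users per admin."""
    return ratio(active_users, admins)
```

A wiki with 280 edits in the last 28 days and 10 active users would get an activity index of 28.0 under this definition.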


 * You will see I made a Template:Website indexes that is now showing four calculated indexes for all wikis. I also added a table of "Most Active Wikis" to the Statistics page. Pretty fun! Thingles (talk) 08:24, December 31, 2012 (CST)


 * Yay, we two beat Wikidata :) It's getting better here by the day! --[[kgh]] (talk) 08:28, December 31, 2012 (CST)

Editor
I created a new Editors group and have added you to it so that I don't need to patrol your edits. :-) And also so that at some point Forms and Templates can be restricted but you can still edit. Any suggestions on other rights or groups that should exist would be very welcome. Thingles (talk) 06:32, December 30, 2012 (CST)


 * Great, thank you. :) One thing that comes to my mind immediately is . I will provide other suggestions along my way here. Cheers --[[kgh]] (talk) 06:36, December 30, 2012 (CST)


 * Added! Thingles (talk) 06:40, December 30, 2012 (CST)


 * What about  for operators? Cheers --[[kgh]] (talk) 20:13, 21 January 2013 (UTC)


 * Good idea. Done. Special:ListGroupRights. Thingles (talk) 20:37, 21 January 2013 (UTC)

Changing subobjects
Just a heads up on a big change I'm making right now. Right now the bot writes extension and general information to /Extensions and /General subpages of a given website. The subobjects are then attached to that subpage with a linking Has website property. I'm changing this after doing some tests. I'm going to still write the subpages, but the data will be in  blocks. Then I will transclude the subpages into the website page.

This will have two wins. First, queries will be easier to create since all data will be in the website object or subobjects directly connected to it. Also, since a transclusion is in place when the bot changes the subpage it will trigger MediaWiki to update the page with the transclusion and get new data. This does mean some queries will have to be modified. Thingles (talk) 14:55, December 30, 2012 (CST)


 * I am touching wood. Support from me. :) Modifying the queries should not be the problem then. Cheers --[[kgh]] (talk) 15:10, December 30, 2012 (CST)


 * Change is made and should be propagated through. When you look at the properties for a website you'll see all the data now there. Thingles (talk) 15:19, December 30, 2012 (CST)

Header tabs
Funny! I was thinking that adding header tabs to the website template would be a good thing! See you just did that! Sweet. I'll pull the graphs out of the tabs though. I'm signing off for a bit. Youngest is done with nap and we're hanging out. Thingles (talk) 15:22, December 30, 2012 (CST)


 * I guess this was a natural thing to do. :) Admittedly, I have to do something else, too. Enjoy your family time now! --[[kgh]] (talk) 15:26, December 30, 2012 (CST)

Sync with Semantic MediaWiki Community Wiki
I was looking briefly at the SMW Community Wiki on Referata and I think it would be pretty simple for me to write a bot that "synced" the websites registered there over to WikiApiary. Do you think this would be a good idea? Bad idea? Inappropriate? Cool? What do you think? In about 15 minutes I already have the query for pulling all the public, active sites from that wiki. Now I just need to make sure to not duplicate and attempt to discover the API endpoint by retrieving the base page and looking at the meta headers. Thoughts? Thingles (talk) 22:31, January 4, 2013 (CST)
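
A rough Python sketch of the endpoint discovery step mentioned above, assuming that "meta headers" refers to the `EditURI` link element that MediaWiki emits in the page head (its `href` points at `api.php?action=rsd`). The class and function names are made up for illustration, and fetching the page over HTTP is left out so the sketch stays self-contained.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin, urlsplit, urlunsplit


class EditURIFinder(HTMLParser):
    """Remember the href of the first <link rel="EditURI"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("rel") or "").lower() == "edituri":
            self.href = attributes.get("href")


def discover_api_endpoint(page_url: str, html_text: str) -> Optional[str]:
    """Return the api.php URL advertised by a wiki page, or None if absent."""
    finder = EditURIFinder()
    finder.feed(html_text)
    if not finder.href:
        return None
    # The href may be protocol-relative or relative, so resolve it first.
    absolute = urljoin(page_url, finder.href)
    parts = urlsplit(absolute)
    # Drop the "?action=rsd" query so only the bare endpoint remains.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

Wikis that do not expose the link element would return `None` here and would still need to be checked by hand.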


 * A natural thing to think about this. Still I would not do it without asking Yaron about his thoughts, since he basically built up and fostered this directory. Otherwise it would appear like a bear taking a yummy honeypot. To some more or less large extent this wiki would supersede the community wiki after this was done. --[[kgh]] (talk) 05:13, January 5, 2013 (CST)


 * Thanks for the feedback. I agree, and I need to focus on getting the collector right before hundreds of additional sites are added. Thanks! Thingles (talk) 14:59, January 6, 2013 (CST)


 * Once I get the rewrite of Bumble Bee (below) done and am actually honoring the Check every property I plan on publicly announcing this wiki on the SMW users mailing list. As it is, very few people know it exists. Thingles (talk) 15:05, January 6, 2013 (CST)


 * Still I would approach Yaron before announcing this wiki publicly since some people might see the similarity of both wikis and will ask about it. I am sure he will like this wiki anyway. --[[kgh]] (talk) 16:28, January 6, 2013 (CST)


 * Good idea. I don't know Yaron aside from some posts on the mailing list but when I get this bot stuff sorted I'll drop him a note, and CC you. Thingles (talk) 17:34, January 6, 2013 (CST)


 * Perfect! Thank you for doing this. --[[kgh]] (talk) 12:16, January 12, 2013 (CST)


 * Nice response from Yaron! Thingles (talk) 09:58, January 13, 2013 (CST)


 * Indeed. He is also a big fan of good and innovative ideas ... --[[kgh]] (talk) 10:05, January 13, 2013 (CST)

Bumble Bee on Github
Just FYI, I've started working on a rewrite of the three hacked together Python bots that collect information into one well written bot. If you want to see it there is a WikiApiary repo on Github.

Some highlights:


 * 1) When Bumble Bee cannot talk to a website (it's not responding, returns bad info) I plan on having it write a templated log message to the Talk page for that website. That make sense to you? Would there be a better place for it?
 * 2) Right now Bumble Bee isn't honoring the Check every setting. It pulls every 15 minutes regardless. This one will, but I'm going to move that from Check every hours to every minutes so that some sites (Wikipedia) can have 15 minute collection periods.
 * 3) I'm going to consolidate the three separate scripts that get stats, extensions and general info into this one.
 * 4) I had originally planned on pulling in the list of namespaces. You see that in the websites form. I'm reconsidering that. I'm not sure it adds any value unless I did a page count by namespace. That would be a bit harder than I want to do now, so just a list of Namespaces seems less useful.
 * 5) I'm totally excited about the Property:Has bot segment and running multiple Bumble Bees at once. I've already got that in the rewrite. I could easily run 4 of these at a time and keep stats collecting very fast without having to write threading code.
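
Points 2 and 5 above can be sketched together in Python: each bot process handles one segment and only collects from sites whose "Check every" interval (in minutes) has elapsed. This is an illustration only, not the code in the WikiApiary repo; the `Site` fields and function names are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Site:
    name: str
    bot_segment: int            # which bot instance is responsible for this site
    check_every_minutes: int    # collection interval, in minutes
    last_checked: Optional[datetime] = None


def is_due(site: Site, now: datetime) -> bool:
    """A site is due if it was never checked or its interval has elapsed."""
    if site.last_checked is None:
        return True
    return now - site.last_checked >= timedelta(minutes=site.check_every_minutes)


def sites_for_bot(sites: List[Site], segment: int, now: datetime) -> List[Site]:
    """Select the sites one bot instance should collect from on this run."""
    return [s for s in sites if s.bot_segment == segment and is_due(s, now)]
```

Running four processes with segments 0 to 3 would split the work without any threading code, since each process only sees its own slice of the site list.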

Anyway, that is why you see fewer edits in wikispace from me right now.

Thingles (talk) 15:04, January 6, 2013 (CST)


 * Thank you for keeping me in the loop. After you added the new property I found my way to GitHub. By the way, I am following you now. :)


 * …and I you! :-) Thingles (talk) 17:35, January 6, 2013 (CST)


 * ad 1) That's a good idea. This info belongs to the page since this is a page the admin of the wiki will most likely watch. Still adding semantics to the template for it to be also queried on a central maintenance page is favourable as well as the possibility to update this info within the template in a sense of "I worked on the problem successfully. Done."
 * ad 2) I was already wondering about these checks since my impression from looking at the graphs was that it is doing them much more frequently. My weblog is full of bee activity too. :) So Bumble Bee is presently as busy as a bee can get. :D Going to minutes is good. Besides, since Bumble Bee is doing these checks on a regular basis anyway, information about the uptime of the respective wiki could probably easily be added to the website's page. I guess this will be very interesting at least for the admins too.


 * If I tried to add availability monitoring as well I would probably put that in a separate bot by itself and have that run every 5-minutes. Not in my plans now, but who knows. Thingles (talk) 17:39, January 6, 2013 (CST)


 * I see. Fair enough. Still an admin fave. :) --[[kgh]] (talk) 13:58, January 7, 2013 (CST)


 * ad 3) When doing this it would be great to have the new structure of the template in mind, i.e. to split up the General information so it may be easily allocated to the new sections. However, as I got to know you up till now, you have already concrete plans to cater for this. :)


 * I plan on leaving the templates as is, just having them all generated in the same Python as opposed to three separate scripts. Thingles (talk) 17:39, January 6, 2013 (CST)


 * Ah, we will have to see how to allocate the infos from the template. Will involve querying. --[[kgh]] (talk) 13:58, January 7, 2013 (CST)


 * ad 4) I was wondering about this one too. However, I did not put much thought into it since I regarded this as a less important information. It is indeed less helpful having the purpose of this website in mind.


 * Unfortunately I don't see any method in the Mediawiki API that allows me to ask for a count of pages in a given namespace. It looks like I would have to ask the wiki to give me the list of page names and then count them, which I think is too expensive. I can envision some crazy future where there is a WikiApiary Extension that could optionally be installed and could collect this type of information locally and present it in a nice API. That will go on the Roadmap. Thingles (talk) 17:39, January 6, 2013 (CST)


 * Probably only evolved wikis use more than the ones provided out of the box. In case you have extra namespaces you could set up  to have their pages included in regular statistics. Having an extra extension to do this will be an option, too. Not on the front burner though. --[[kgh]] (talk) 13:58, January 7, 2013 (CST)


 * ad 5) A cool idea which will work out great!
 * Cheers --[[kgh]] (talk) 17:01, January 6, 2013 (CST)
 * PS I thought you might have another life with wife and kiddies and a job. No worries about your activity. :)
 * If there were just three of me it would be so much easier to get stuff done! :-) Thingles (talk) 17:39, January 6, 2013 (CST)
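
On point 4 above, a hedged Python sketch of the "list the pages and count them" approach, using the API's `list=allpages` query with `apnamespace`. It assumes the JSON continuation format of current MediaWiki releases (a top-level `continue` object), which differs from what the API returned in early 2013. The `fetch` callable is a placeholder for whatever performs the HTTP request and returns the decoded JSON.

```python
from typing import Any, Callable, Dict


def count_pages_in_namespace(
    fetch: Callable[[Dict[str, Any]], Dict[str, Any]],
    namespace: int,
) -> int:
    """Count pages in one namespace by paging through list=allpages."""
    params: Dict[str, Any] = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": "max",
        "format": "json",
    }
    total = 0
    while True:
        data = fetch(params)
        total += len(data.get("query", {}).get("allpages", []))
        cont = data.get("continue")
        if not cont:
            return total
        # Merge the continuation values into the next request.
        params = {**params, **cont}
```

Every batch is a separate request, so a large namespace costs many round trips, which matches the concern that this is too expensive to run routinely.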

Wikipedia Jobs Graph
Check out the jobs graph for Wikipedia (en). Interesting build up. At some point it would be interesting to trigger a notification to admins when jobs are not clearing fast enough. Thingles (talk) 09:59, January 7, 2013 (CST)


 * :) Probably there are some people constantly fiddling around with templates such as taxoboxes etc. at the moment. + from me. --[[kgh]] (talk) 14:00, January 7, 2013 (CST)

Reconsidering Farms
I've been reconsidering the issue of wiki farms that you brought up before. I'm thinking that my idea of tagging isn't really enough for farms/hosters/platforms. Tags are great for a bunch of things, but the farms themselves end up having data. I'm thinking that farms should be a specific (single) property of websites, and that those should populate "farm pages" in another namespace (very much like the extension namespace). Those pages will then be their own objects and hold aggregated data for the farm itself. I think that would be the place where a farm administrator would go to see status across their platform.

I'm not eager to attempt to import 30,000 Wikia wikis, but I am eager to see aggregated performance across my 14 or so wikis in my garden. And I can definitely see some smaller platforms with hundreds of wikis using WikiApiary actively.

What do you think? And if you agree, what would the namespace and property be called? I see one property of a farm being its type or classification which would be autodetermined based on the number of wikis in the farm ( < 20 = garden, < 100 = ???, < 1000 = ???). Thingles (talk) 06:57, January 8, 2013 (CST)


 * I think that this is a good idea as you may already have expected. :) One thing that could be avoided is the statistical imbalance I already mentioned, the other is the specific aggregation you just mentioned. I would call the namespace "Wiki farm" since this seems to be the established and most commonly used term for it. I would however not do a named classification therein based on the number of instances within it as you proposed. There are just not enough of them around and the naming would be arbitrary with minimum additional benefit. In case you insist on this you could just call it "small wiki farm" ( > 10), "medium wiki farm" ( > 100), "large wiki farm" ( > 1000). The minimum number of wikis to qualify for the "Wiki farm" namespace should be 10 I guess. --[[kgh]] (talk) 09:00, January 8, 2013 (CST)


 * Excellent. I'm thinking just "Farm" and "Farm talk" for the namespaces, and Property:Has farm. I planned on just making these automatically created pages, similar to extensions. This would not allow for a minimum instance count before creating though. It does seem to me that a property of a farm should be the website count in it, regardless of whether that is named. I would also think that a farm type would make sense (commercial, non-profit, hobby). Thoughts on other properties of farms? I'll wire this all up in the next day or two. Fun! :-) Thingles (talk) 09:14, January 8, 2013 (CST)


 * Yeah, go for it! "Farm" and "Has farm" is ok, also the properties about size and type. Perhaps a property about establishment would be nice to have, but I am not so sure about it. --[[kgh]] (talk) 09:34, January 8, 2013 (CST)


 * By establishment do you mean the organization (if any) that runs the farm? Thingles (talk) 09:57, January 8, 2013 (CST)


 * Yep, Wiki was established in 2004 I guess, your farm came into existence in ... --[[kgh]] (talk) 09:59, January 8, 2013 (CST)


 * Farms now exist! Farm:Thingelstad.com I decided to not automatically create the farm pages. Thingles (talk) 20:02, January 8, 2013 (CST)


 * Not to do it automatically will not be a problem since there are only two hands full of farms around anyway. A great thing to have them. --[[kgh]] (talk) 14:19, January 9, 2013 (CST)


 * I fluffed up the templates and forms for farms. :) Cheers --[[kgh]] (talk) 16:29, January 10, 2013 (CST)


 * Thanks — looks awesome! Thingles (talk) 21:57, January 11, 2013 (CST)


 * And thank you for rectifying the shortcomings of my changes! --[[kgh]] (talk) 09:30, January 12, 2013 (CST)

✅

Bulk Loading
Check out Bulk import websites. Clearly the quickest way to add a bunch of sites, if so desired. I used it to import 13 websites tonight as a test, just pulling one of the groups from S23 WikiStats. Worked easy. Thingles (talk) 21:09, January 8, 2013 (CST)


 * That's definitely the way to go. The cool thing is that Data Transfer recognises existing pages and does not import if instructed. So, no typos in titles please. :) --[[kgh]] (talk) 14:18, January 9, 2013 (CST)

Redid Main Page
Took a first attempt at a total revamp of the Main Page. Decided to try using the article count as a grouping mechanism. Step in the right direction... Thingles (talk) 21:57, January 11, 2013 (CST)


 * Definitely a good change. :) I just adjusted the format of the general stats section. Probably some of it may go into a separate class. I moved the bot stats out since this is not interesting at all for regular visitors. Also they are unrelated to sites and farms anyway. Please move them in again in case you think they should be there. --[[kgh]] (talk) 09:33, January 12, 2013 (CST)

Lots of sites!
Wow — you are adding sites like crazy! That's awesome! Thanks! Thingles (talk) 09:22, January 13, 2013 (CST)


 * Yeah, but for the time being only sites of which I believe that they are not properly listed elsewhere, e.g. Biowikifarm wikis. The rest is for Bumble Bee. :) Cheers --[[kgh]] (talk) 09:25, January 13, 2013 (CST)


 * I added the Plantnet farm under the biowiki farm. I like the ability to have farms inside of farms. :-) I actually think for large imports I'm gonna just do a CSV bulk import rather than a bot. Main area to use a bot would be to maintain synchronization, but if it's just a one time load a CSV will be easier. Thingles (talk) 09:57, January 13, 2013 (CST)


 * Hey, that's what I kept for tomorrow to do. :) Ah, I remember that we already talked about the CSV import. You are right, this is what I meant. A hundred a day from the MediaWiki section (unaffiliated wikis)? --[[kgh]] (talk) 10:09, January 13, 2013 (CST)

Page Protections Added
I put in place some protection levels around the existing Project:Editors and Project:Operators groups. I added protection to the common forms and templates, as well as Main Page. You should be fine editing, but let me know if you see any issues. See Special:ProtectedPages for details. This is the first time I've set up custom protection levels. Was really happy to find this $wgRestrictionLevels 101 post to figure it out. Thingles (talk) 14:41, January 13, 2013 (CST)


 * Admittedly I have never used this parameter so far. However, it is great to know about it. I added the link to the tutorial to the talk page of mw.o. :) In case there is a problem I will drop you a note but everything seems to be ok. Cheers --[[kgh]] (talk) 15:35, January 13, 2013 (CST)

Default timezone to UTC?
Right now WikiApiary is sharing the default timezone that all my Farm:Thingelstad.com wikis share. I’m thinking I should switch it to UTC. Agree? Thingles (talk) 16:38, January 13, 2013 (CST)


 * Strong support. Since the focus of this wiki is global ... --[[kgh]] (talk) 16:53, January 13, 2013 (CST)


 * Thought this would be easy, but sure MediaWiki is insisting on remembering "America/Chicago" timezone. Still working on it. Thingles (talk) 19:42, 13 January 2013 (CST)


 * Think I got it now! Had to explicitly do some stuff differently given the farm config I use. Thingles (talk) 01:50, 14 January 2013 (UTC)


 * It's also worth noting that all the data collected on wikis has always been in UTC, so moving the wiki to UTC was definitely needed. Pretty sure it's all good now. Thingles (talk) 02:43, 14 January 2013 (UTC)


 * Looks ok to me. Thank you. :) --[[kgh]] (talk) 20:09, 14 January 2013 (UTC)

✅

Wikicafe/General
I saw you edited Wikicafe/General. That edit will get overwritten on the next update. Was there an issue with the data? It looked like you added data that Bumble Bee wasn't able to see? Thingles (talk) 13:52, 14 January 2013 (UTC)


 * That's true, the data I added were not detected for some reason. Since I believe that the data collected is pretty static and do not think that Bumble Bee will update this page often, I thought I'd give it a try. Obviously the data will get lost on the next update and I am well aware of it. Since it is the biggest semantic wiki I will have a closer look at it in future anyway, now that it is here. Cheers --[[kgh]] (talk) 15:15, 14 January 2013 (UTC)


 * Hey, Bumble Bee tricked me. It updates as soon as there is a difference between the data collected and the data saved. :) --[[kgh]] (talk) 09:22, 15 January 2013 (UTC)


 * Bumble Bee is actually even dumber than that. Every night Bumble Bee gets the list of extensions, builds a page of wikitext to describe that and saves it to the wiki. MediaWiki is the one that is smart enough to say "Hey, that's the same thing I have now. I'll accept this as a  event and not put anything in the history." So, anytime there is a difference between the page and Bumble Bee the change will be blown away within 24 hours. Thingles (talk) 17:16, 15 January 2013 (UTC)


 * Ouch, having been tricked by a dumb bee is not nice to read. ;) And I thought it would check by itself if there was a change in comparison to the day before. My thinking was of a truly much more complicated solution. I tend to be like this. This lesson is learned. Cheers --[[kgh]] (talk) 17:48, 15 January 2013 (UTC)
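
The behaviour described in this thread can be modelled with a small Python toy. This is not MediaWiki's code, only a sketch of the idea that a save with unchanged text adds nothing to the history, while any manual change is replaced on the bot's next run. The `Page` class is invented for illustration.

```python
from typing import List


class Page:
    """Toy page store: saving identical text records no new revision."""

    def __init__(self) -> None:
        self.revisions: List[str] = []

    def save(self, text: str) -> bool:
        """Store text; return True only if a new revision was recorded."""
        if self.revisions and self.revisions[-1] == text:
            return False  # same content as the current revision
        self.revisions.append(text)
        return True


page = Page()
generated = "extension list as built by the bot"
page.save(generated)                     # first nightly run
page.save(generated)                     # second run: nothing recorded
page.save(generated + " + manual edit")  # a human edits the subpage
page.save(generated)                     # next run overwrites the manual edit
```

The bot never compares anything itself; it just saves, and the store decides whether the save counts.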

Monitored by
Thought you might find Monitored by fun. Way for sites to spread the word! Farm:Thingelstad.com has this on all wikis. :-) Thingles (talk) 03:25, 16 January 2013 (UTC)


 * I truly do and I will add it to my websites soon where possible. By the way, which software do you use to create it? Cheers --[[kgh]] (talk) 09:58, 16 January 2013 (UTC)


 * I used Acorn to create the layered image. And then crushed another 30% of bits out of the final PNG with ImageOptim. I'm no designer, but I think it came out decent enough. Thingles (talk) 13:07, 16 January 2013 (UTC)


 * It looks very professional to me, and probably only a few will see the need for improvement if at all. I was asking because I am not quite happy with mine. Cheers --[[kgh]] (talk) 17:20, 16 January 2013 (UTC) PS Admittedly, the last time I worked with an Apple machine was in '88 on an Apple II. I am sure there is somebody nearby using an Apple.


 * I just realised that my little sister works with an Apple. I just transcluded the new monitored by on CAcert in Berlin. I think I am the first one to do this apart from wikis on the Thingelstad.com farm. Cheers --[[kgh]] (talk) 23:52, 20 January 2013 (UTC)

Released WikiApiary to Semantic MediaWiki Users
You'll see I sent an email to the Semantic MediaWiki Users mailing list introducing WikiApiary! I wasn't 100%, but I wrote it and hit send before I could change my mind. :-) If hundreds and hundreds of sites get setup, I may have to scramble. But we'll see. I put a special thank you to you in the email — thank you! :-) Damn.. looks like it stripped all my links out of the email. Thingles (talk) 04:57, 16 January 2013 (UTC)


 * Yeah, I read it and would like to say thank you that you appreciate my help on your project. This was very kind of you. Thanks a bunch! Cheers --[[kgh]] (talk) 17:15, 16 January 2013 (UTC) PS I am keen to see the momentum it will be picking up from the SMW Community. PPS Before making it public to a much wider community still some work has to be done, I guess.


 * It looks like WikiApiary picked up a few websites directly from the SMW mailing list, and got on people's radar. There wasn't any huge spike in traffic or usage. I did see in Piwik that a number of people spent more time than usual on the site. I agree that there is more work and cleaning to do before spreading the word further. My main concern is getting the new version of the bots in place. Do you have a list of things that you see as being important next steps? Maybe we should put a short list on Project:Roadmap? I'd like to get the word to the broader MediaWiki users mailing list next. Thingles (talk) 13:24, 17 January 2013 (UTC)


 * The list has about 500 subscribers and, judging from the traffic it gets, it will probably still take some time until all have seen the announcement. Admittedly I had expected a bit more visible enthusiasm, but trust me the bee will collect them. The most important thing next to what you already stated is to work a bit on the look and feel of the website. I have made some changes to the main page and to the forms but I still think that this is preliminary. I think I dropped some ideas about it. The other thing is to get the Statistics page in order. Now you have to do a lot of scrolling and the load time is not the best either since the rendering of the graphs takes its time. Probably it should be split into two pages (standalone and farm). Also the other main pages need a bit of work. So far I have mostly made some changes to the extension's main page. To cut a long story short: let's work on the roadmap. --[[kgh]] (talk) 22:17, 17 January 2013 (UTC)


 * Related, would you like an account on my Piwik server so you can see traffic for WikiApiary? And/or, would you like to get a weekly report of traffic on WikiApiary? Feel free to say no, just wanted to offer since you have been very active. Thingles (talk) 13:24, 17 January 2013 (UTC)


 * I am curious, so why not? :) I believe I already subscribed to the WikiApiary weekly traffic report. Cheers --[[kgh]] (talk) 22:17, 17 January 2013 (UTC)


 * Great. Details just sent in email. Thingles (talk) 13:34, 18 January 2013 (UTC)

Extension pages have more data
I just modified the Bumble Bee script that writes the  pages to add three additional fields that are available in the API response that I was previously just ignoring. This diff shows the new data. I have not modified Template:Extension in use to pick up these new fields or set additional properties for them, yet. Thingles (talk) 13:20, 17 January 2013 (UTC)


 * Yeah, this is good. I believe we should now split up "tags" into "type" and "author" which would already have made sense earlier. I would rather not populate the extensions pages automatically but do an import with the extra data once information about an extension is available. Will it be possible to have a spreadsheet with all extensions data so I can work on it for the initial import? Except for the version and to some extent the author these data are pretty static. So automatic changes should only be done for the version if possible. The way authors are provided is very inconsistent and does not look very professional. --[[kgh]] (talk) 16:29, 17 January 2013 (UTC)

✅

Talk:BrewWiki.com
Take a peek at the debugging I did on Talk:BrewWiki.com. Thingles (talk) 05:38, 19 January 2013 (UTC)

Collection errors on talk pages
You'll see that User:Bumble Bee started to record errors in collecting statistics to Talk pages for sites (Talk:29C3 Public Wiki‎, Talk:Heckipedia‎). I thought about writing a Template so that it could be styled in the wiki, but opted for just writing the exception text out full. I got worried the exception may contain special wikitext characters. But, a template may be the better way to go. I also changed the edits to bot edits after the first trial runs so it won't fill up Special:RecentChanges. Thingles (talk) 22:14, 20 January 2013 (UTC)
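
On the worry about special wikitext characters in exception text: one possible approach, sketched in Python, is to escape the message and wrap it in `<nowiki>` before writing it to the talk page. The function name, the notice wording, and the choice of a list item instead of a heading are assumptions here, not what Bumble Bee actually does.

```python
import html
from datetime import datetime, timezone


def format_error_notice(exc: BaseException, when: datetime) -> str:
    """Build a one-line wikitext notice that is safe to append to a talk page."""
    raw = f"{type(exc).__name__}: {exc}"
    # Collapse newlines and repeated whitespace so the notice stays on one line.
    raw = " ".join(raw.split())
    # Escape &, < and > so the message cannot close the nowiki tag early.
    safe = html.escape(raw, quote=False)
    # Assumes `when` is already in UTC.
    stamp = when.strftime("%Y-%m-%d %H:%M:%S UTC")
    return f"* {stamp} collection error: <nowiki>{safe}</nowiki>"


notice = format_error_notice(
    ValueError("bad {{template}} </nowiki> [[link]]"),
    datetime(2013, 1, 20, 22, 14, 0, tzinfo=timezone.utc),
)
```

Inside `<nowiki>`, braces and brackets are shown literally, so a stray `{{` or `[[` in an exception message would not trigger a template call or a link.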


 * Hmm, even just seeing a few entries on Talk:BrewWiki.com I'm not liking how the current format I'm logging these errors in pollutes the TOC. Open to suggestions on how to format the error notices. I think ultimately the TOC should only be populated by human conversation, and bot errors should be recorded inline in as they occur, but should be visually separate and more subtle. Thingles (talk) 22:34, 20 January 2013 (UTC)


 * (Edit conflict) Having the error message like this is fine. The only thing I wonder about is how frequently Bumble Bee will do this. Probably only after there were successful calls thereafter followed again by a problem. --[[kgh]] (talk) 22:40, 20 January 2013 (UTC)


 * I do not think that we will have a lot of talk on website, farm, or extensions talk pages, so having these infos seems ok to me. These talk pages are more like workbenches for the person concerned. So far most of the talk was made on user pages here and most issues that will come up probably will continue to take place on user pages. Talk on other pages will probably be related to Bumble Bee's edits. --[[kgh]] (talk) 22:40, 20 January 2013 (UTC)

Crazy Stuff to get at version info
Check out User:Thingles/Scratch4 for some truly insane stuff working to get a better way to detect version information from extensions. :-) Thingles (talk) 18:53, 21 January 2013 (UTC)


 * WOW! I feel very stupid when looking at the code. :) --[[kgh]] (talk) 19:29, 21 January 2013 (UTC)

Generated Authors for Extensions
See User:Thingles/Scratch3 to see how we can now get a unique set of authors for extensions automatically. This could be displayed somewhere on Template:Extension. I think there should still be a form for authors that is manually controlled. But this could make it easy to populate that, and serves as a secondary display of all users recorded around the wikisphere. Something similar can be done for Property:Has extension type and Property:Has extension URL. Thingles (talk) 21:22, 21 January 2013 (UTC)


 * To separate type and authors is a very good idea. During the past three weeks I have gone through about 150 extensions. In a lot of cases I have added more than the one available tag, e.g. Semantic MediaWiki because all of them apply next to the standard one provided by the extension itself. In a lot of cases I also amended the extension's description to make it more consistent or understandable at all. In general, as you also will have noticed, the extension registration is quite an awful mess, so working on them will be desirable. I think we could move the data in for all extensions I have not worked on so far, if possible, since the autopopulation would be a regression there. Thus we would have data for all extensions. Because of this it would be great to be able to edit all information bits. The data on the URL, tag and description should be quite solid I believe. Perhaps no automatic updates for them after they were added once. Hmm... --[[kgh]] (talk) 21:59, 21 January 2013 (UTC) PS I will work my way through the ones I have completed so far. So no worry about them. --[[kgh]] (talk) 22:08, 21 January 2013 (UTC) Things are on the roll, I guess. :) --[[kgh]] (talk) 22:19, 21 January 2013 (UTC)


 * I like your focus on a quality set of content for extensions. I think that's good. Just exploring some options, what if we added tabs to the Extension template and had the curated, clean descriptions and such on the first tab and then had a second tab that displayed the harvested information that User:Bumble Bee gets when he pulls? This data would be knowingly less accurate and noisy, but would be very complete with no intervention. Thingles (talk) 23:16, 21 January 2013 (UTC)


 * However, quality makes things harder to maintain. I do not think that having two tabs is a good option since it will perhaps cause confusion and requires a new set of properties. What about doing an #if nothing was added manually to the specific property on the extension's page then #ask for this information as Bumble Bee harvested it and add it to the property? --&#91;&#91;kgh&#93;&#93; (talk) 15:04, 22 January 2013 (UTC) PS When it comes to authors and versions always the information harvested by Bumble Bee should be set, i.e. no manual updates here. --&#91;&#91;kgh&#93;&#93; (talk) 16:10, 22 January 2013 (UTC)

Importing 7,361 wikis
Okay, I've got my eyes squarely on Pavlo's Alive Filtered list of wiki API endpoints. 7,361 wikis. It looks like this is just api.php endpoints.

I've grabbed this file and it will not be hard for me to write a one-time bot that will pull the name of the wiki via the API endpoint and create a minimal Web site template page for the wiki. The more challenging part will be trickling them in, and I'm wondering what you think of a different approach.

First off, Bumble Bee would get crushed as it is with 7,000 more websites (I've moved him to running every hour already, btw). The new version I think will work fine, but I know the old one won't. How do you feel about importing all 7,361 (minus ones that don't respond to an API request) with Property:Is validated and Property:Is active set to False? Then, over time, groups of them are turned on as they are validated and activated, just like they would be if an end user submitted them. I think that would work fine, but it would create a huge list of inactive sites for a while. Thoughts?

I guess the other option is I have a "skip lines" value in the bot and run it repeatedly with ever increasing skip lines, but it seems like the net effect would be the same just more gradual.

Thingles (talk) 02:44, 26 January 2013 (UTC)

Okay, that was quick. I have the bot done. It queries the API URL to get the site name and builds out the Template:Website block. I have saving to the wiki disabled for now. See Project:Pavlo import project for the example run I just did with the first 20 records. I'm not doing any checking to make sure it doesn't overwrite one of the existing websites. Not sure how critical that would be. This runs as Audit Bee.
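
A rough sketch of what such a one-time import could look like, assuming the siteinfo response has already been fetched and parsed as JSON (for example from `api.php?action=query&meta=siteinfo&siprop=general&format=json`). The template parameter names (`Name`, `API URL`, `Validated`, `Active`) and the helper names are assumptions for illustration, not the actual Audit Bee code; `skip_lines` mirrors the "skip lines" idea mentioned above.

```python
from typing import Iterable, Iterator


def site_name(siteinfo: dict) -> str:
    """Extract the wiki name from a parsed siteinfo (siprop=general) response."""
    return siteinfo["query"]["general"]["sitename"]


def website_block(name: str, api_url: str) -> str:
    """Build a minimal Template:Website call; parameter names are hypothetical."""
    return (
        "{{Website\n"
        f"|Name={name}\n"
        f"|API URL={api_url}\n"
        "|Validated=No\n"   # imported sites start out not validated
        "|Active=No\n"      # ... and not active, so no collection happens yet
        "}}"
    )


def endpoints(lines: Iterable[str], skip_lines: int = 0) -> Iterator[str]:
    """Yield non-empty API endpoints, skipping the first skip_lines lines."""
    for number, line in enumerate(lines):
        if number < skip_lines:
            continue
        url = line.strip()
        if url:
            yield url
```

Keeping the network fetch outside these helpers makes it easy to dry-run the import against a handful of records before saving anything to the wiki.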

Thingles (talk) 03:09, 26 January 2013 (UTC)


 * Yep, this can be done to get things going. What about doing a first batch with 700 pages to see what happens? Hmm, we really could use a second or third operator. --&#91;&#91;kgh&#93;&#93; (talk) 10:12, 26 January 2013 (UTC)


 * Significant progress, see Pavlo import project‎. Thingles (talk) 14:19, 26 January 2013 (UTC)


 * I am getting more and more excited. The automated import of the logos is really a big relief, since the logos were the biggest speed breaker. Great --&#91;&#91;kgh&#93;&#93; (talk) 14:52, 26 January 2013 (UTC)


 * I see you making a lot of edits. Note that Audit Bee (as it is) will overwrite those and require an undo. I'm feeling like we are pretty good to go and should just start the import to bring them all in? Thingles (talk) 15:25, 26 January 2013 (UTC)


 * Why so? Will Audit Bee start at the beginning again? --&#91;&#91;kgh&#93;&#93; (talk) 15:28, 26 January 2013 (UTC)


 * Good point... I can just skip the first 100 lines. :-) Duh. Thingles (talk) 15:29, 26 January 2013 (UTC)


 * I am relieved. I thought you might be doing batches at 100 anyway. --&#91;&#91;kgh&#93;&#93; (talk) 15:32, 26 January 2013 (UTC)


 * I have gone through some imported wikis now. I guess I will be able to work at 40 wikis a day, so about 180 days to go. --&#91;&#91;kgh&#93;&#93; (talk) 18:22, 26 January 2013 (UTC)


 * :-) Don't go too fast, I'm worried about Bumble Bee getting overloaded before I have the new version ready. Also, I added some very helpful links and buttons to the verify and activated stuff. Added an autoedit form link too. Much nicer. The main thing is the links for the API calls Bumble Bee depends on are there so they can be verified for real before activating and then generating errors. Thingles (talk) 20:38, 26 January 2013 (UTC)


 * Bumble Bee tricked me, so this is my late revenge. :) Still about 40 quality pages a day not causing problems should be ok? Moving the info into separate templates and allowing direct edit is great. The latter more for further than for initial page maintenance but this is very ok. I will add hints regarding the required minimum version for the different API modules as it is already done for SMW. This will increase the quality of my edits. --&#91;&#91;kgh&#93;&#93; (talk) 20:52, 26 January 2013 (UTC)

Image uploading for Project:Pavlo import project
Seeing how many images have been uploaded today for this massive import, I'm so happy I took the extra hour to figure out how to automate that, even if it's far from perfect. Wow. Thingles (talk) 05:09, 27 January 2013 (UTC)


 * This is indeed a big time relief. The other very good idea was to provide the links for the API calls. This makes things much easier. This extra hour was a very good investment! Lots of kudos for ya. --&#91;&#91;kgh&#93;&#93; (talk) 09:48, 27 January 2013 (UTC)

Also, when this is done the data set in WikiApiary is going to be much cleaner than others I've seen. A lot of dead websites in that list, about 30%, and a lot of duplicates. Thingles (talk) 05:11, 27 January 2013 (UTC)


 * Yep, but more importantly it seems much easier here to keep the data sane. So we have built-in data sustainability. --&#91;&#91;kgh&#93;&#93; (talk) 09:48, 27 January 2013 (UTC)

Redirects
This made me LOL foks, have you ever heard of permanent redirects, that's spamming, keeping the first version. :-) Thingles (talk) 13:35, 27 January 2013 (UTC)


 * Yeah, I had a jolly good one, too. :) --&#91;&#91;kgh&#93;&#93; (talk) 13:38, 27 January 2013 (UTC)


 * That's cool and must have a meaning. Bumble Bee just created the page for the FindSpam extension. :) --&#91;&#91;kgh&#93;&#93; (talk) 13:39, 27 January 2013 (UTC)

Popup graphs
Just an FYI on the revamp of the graphs. I'm planning on moving the 6-8 graphs on the site pages to three larger graphs that will resize with the window. The plan is to highlight the 3 most important graphs. Each graph will have common controls, and the exciting part, will be able to popup into a new window. I'm focused on first making the new window work. You can see work in progress at this popup graph for Wikipedia and WikiApiary. That's my plan. I'm excited by the idea of people being able to have several graphs open at the same time. With some Javascript to load new data it becomes an instant dashboard! Thingles (talk) 14:16, 27 January 2013 (UTC)


 * This is b....y cool. Can it get better?!? Are you merging the graphs? All of them are interesting and should be there. I am not sure if too much information in one graph is good. Still, I probably have to see the result. Cannot wait. :) --&#91;&#91;kgh&#93;&#93; (talk) 14:22, 27 January 2013 (UTC)

Project:Pavlo import project done importing
All files have been processed! I also did a specific fix for any site that has had statistics collected but had been marked as no longer verified and fixed those up so that there won't be any big breaks in sites that are already being monitored. Thingles (talk) 16:43, 27 January 2013 (UTC)


 * This is good. I already reverted a couple which were all of a sudden lacking a logo on main or were on my watchlist. --&#91;&#91;kgh&#93;&#93; (talk) 16:47, 27 January 2013 (UTC)

Form:Extension
I can't remember if we covered this already. I want to go ahead and (as you suggested earlier) add real form fields and properties for Category:Extension instead of using Tags for everything. That good with you? You've been doing most of the work in that namespace so wanted to confirm. And the plan is to populate with the automatic data from collection unless a manual value is specified. I'll add a boolean indicating if manual or automatic is being used so it's easy to find and add high-quality info. Thingles (talk) 19:46, 28 January 2013 (UTC)
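
The "automatic unless manual" rule can be sketched roughly as follows. This is only an illustration of the fallback logic; the function name and the `is_manual` key are made up here and are not the actual template or property names.

```python
from typing import Optional


def resolve(manual: Optional[str], harvested: Optional[str]) -> dict:
    """Prefer a manually curated value; otherwise fall back to the harvested one."""
    if manual is not None and manual.strip():
        return {"value": manual.strip(), "is_manual": True}
    # Nothing curated yet, so use whatever the bot collected (may be None).
    return {"value": harvested, "is_manual": False}
```

The boolean makes it possible to query for pages that still rely on harvested data and could use a curated value.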


 * Yeah, I think there is some talk about this around. I added a lot of info there to ensure that at least the 150 most popular extensions are filled with info for the visitors of WikiApiary. Will you add the boolean per info item or for the extension page in total? The version info may always be updated automatically. I am not sure on which version we settled, but the current "tag" should perhaps be divided up into "type" and "authors". Do you think we need an additional "tag" property? Hmm, it's already there so we may as well provide it. The future will tell if it is really needed - yes, we need it at least to indicate if an extension is website specific. However, you may move on as you believe is best. You are full of good ideas so I am going to trust you on this one. :) --&#91;&#91;kgh&#93;&#93; (talk) 20:25, 28 January 2013 (UTC)


 * This is now done! Pretty cool! I put the update about it at WikiApiary talk:Operations/2013/January. I'm trying to move more of the ongoing discussion into the Operators pages and off of our individual talk pages. Thingles (talk) 02:15, 31 January 2013 (UTC)


 * This is very, very awesome. :) First I thought: Why doesn't he move in the descriptions, too? But then I realised. "De-cluttering" our user-pages is the way to go. --&#91;&#91;kgh&#93;&#93; (talk) 17:57, 31 January 2013 (UTC) PS I put working through the existing extensions for transition to the new model on my working list. No worries here.

Awesome!
''I've never made a barnstar, and I finally have a reason to make one so thought I would give it a go! :-)''


 * This is very kind of you and I gracefully accept the barnstar! So this is the bronze version of it? :) I will now be working on getting silver or gold. :) --&#91;&#91;kgh&#93;&#93; (talk) 08:53, 1 February 2013 (UTC)


 * It's not bronze, it's the original one. :-) These are fun. Thingles (talk) 04:51, 2 February 2013 (UTC)


 * Then I will have to work hard to get more of them. :) --&#91;&#91;kgh&#93;&#93; (talk) 09:32, 2 February 2013 (UTC)

Adding Skins?
I keep coming back to this idea, it seems likely I'll add it at some point. Curious what thoughts you have on it. The MediaWiki API has a method to get the skins a site has installed (see results for wikiapiary). I'm not sure when it was added. It wasn't on the MediaWiki documentation in the wiki until I added it recently, but it is in the API text when you dig. There is really no information given except the name of the skin, but maybe that will get better in the future and be more like extensions. My thought is to make a Skin namespace and do exactly like I do now for Extensions for Skins. Pull them in, have pages autocreate, etc. I feel like skins are MediaWiki's weak spot, so my thought is it's a baby step to showing what skins are out there. Unlike Extensions where the focus is sort of by default on the ones used on the most websites, Skins would probably highlight the ones used on very few websites. Would be a way to highlight some of that diversity. Curious what you think? Thingles (talk) 02:19, 1 February 2013 (UTC)


 * To say that skins are a weak spot of MediaWiki is kind of an understatement. Wiki-specific skins have proven, at least this is my strong feeling, to be the major holdup with regard to upgrading. So yes, it would be very good to have these data. The most interesting part will be the one about the custom skins and the MW versions they use. There is currently a tendency to inject skins via extensions, so we will have to identify them too within the extensions namespace, but most of them will still be traditional approaches. I will test wikis to find out which minimum MW version is required for this. --&#91;&#91;kgh&#93;&#93; (talk) 08:59, 1 February 2013 (UTC) PS I just checked. This was just recently added in MW 1.18 --&#91;&#91;kgh&#93;&#93; (talk) 10:05, 1 February 2013 (UTC)


 * This is largely done! See Skin:Main Page. I cloned the extension bot to populate these. Probably need to write a bot that finds all wikis > 1.18 and sets the flag to collect skin data. Thingles (talk) 14:02, 1 February 2013 (UTC)


 * Great. :) It will probably already be the fastest way to activate data collection by doing this with a bot. What is a bit sad is that the API does not consider the settings made with the  parameter, meaning that probably almost all wikis will have the bundled variety of skins available. Admittedly I tried to remove some of them "physically" but got heaps of PHP Notices in the error log. Haven't had a closer look at this since, though. --&#91;&#91;kgh&#93;&#93; (talk) 22:19, 1 February 2013 (UTC)


 * Yeah, this API call seems like it was thrown in without much thought. For example, the property name for the skin is very oddly named "*" (yeah, an asterisk). My guess is that is a bug, but it's working so fine for now. To the extent you have sway in MediaWiki core it would be really nice if this API method at least indicated which skin was the default for the site. If just that were there, it would give much more value to Skin:Main Page. Thingles (talk) 04:50, 2 February 2013 (UTC)
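
For illustration, a small sketch of reading that response, assuming it was fetched with `meta=siteinfo&siprop=skins&format=json` and already parsed. The shape (a list of entries with a `code` key and the display name under the asterisk key) is how the call appeared to behave at the time; treat it as an assumption rather than a guaranteed contract, and the function name is made up.

```python
def skin_names(siteinfo: dict) -> dict:
    """Map each skin code to its display name from a parsed siprop=skins response."""
    # The display name sits under the oddly named "*" key.
    return {skin["code"]: skin["*"] for skin in siteinfo["query"]["skins"]}
```

Nothing in this shape says which skin is the site default, which is the gap discussed in this thread.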


 * I have not even gotten that far. Yes, having information about the standard skin and the installed but skipped skins would definitely enhance the value. I guess I will open an enhancement bug for this. However, there is already value here. The naming problem should be a bug report in any case. --&#91;&#91;kgh&#93;&#93; (talk) 09:35, 2 February 2013 (UTC)

Hey dude!
Just hitting your talk page to say thanks for continuing to validate and activate! You are a machine. I've been quiet recently mostly cause I got completely sidetracked in another wiki project (Planet Kubb Wiki). Just wanted to let you know I'm still here and am close to getting the revamped User:Bumble Bee bot up and running. I expect to have it actively collecting by the end of this weekend. That's it. Thanks. Thingles (talk) 19:51, 19 February 2013 (UTC) PS: Do you happen to play Kubb?


 * Great to read about the progress with Bumble Bee and that you are still here. :) It was getting a bit lonely here. You really need stamina and time to do the validating and activating. However, I think it is worth the effort. Actually I noticed that you are working on Planet Kubb since it was on main page a couple of times. No I do not play Kubb and every time I had the chance to learn it I got stuck at the BBQ which is a great passion of mine. :| Cheers --&#91;&#91;kgh&#93;&#93; (talk) 22:43, 19 February 2013 (UTC)


 * BBQ is a great passion of yours?! Me too! Check out my Big Green Egg setup and various blog posts on Big Green Egg. I'm a bit of an Egghead. :-) Thingles (talk) 15:44, 20 February 2013 (UTC)


 * Not just an Egghead, but a ruler too. :) I have never heard of or seen these Barbieeggs; however, I already like the idea since it should easily be possible to do a nice roast with it, which you usually cannot do with a regular setup. I already looked for retailers here and found them. However, none of them is quoting prices. :) --&#91;&#91;kgh&#93;&#93; (talk) 19:26, 21 February 2013 (UTC)

Bumble Bee v2.0 live!
I just flipped the switch over to the new version of Bumble Bee. This is the one that uses Property:Has bot segment to divide up work, and also actually honors Property:Check every. This also means that Bot log is real, and is reporting real items. You'll note that right now there are no log messages from Bumble Bee that he finished the run, which means that an exception is being thrown before he finishes. I'm getting those in my mailbox and will work through them. It's worth noting that previously Bumble Bee was 4 separate scripts; one each for stats, extensions, general info and skins. The last three ran once a day, in a big batch. Now these are all one. So, the websites that are validated and activated will get their subpages (General, Extension and Skins) populated within 15 minutes. Thingles (talk) 01:01, 24 February 2013 (UTC) PS: You can see all the source code that is running on Github. Of course this lacks the security credentials, but if for some reason you want to see what is happening there it is.


 * I seem to have the initial kinks worked out. The bot is regularly running all the way through. This will be a solid code base to build on and add more robust capabilities. The apiary lib that this uses will also make it much easier to get User:Notify Bee and User:Audit Bee running. Exciting! Thingles (talk) 02:32, 24 February 2013 (UTC)


 * Also, Project:Bot log will be a great place for any operator to see errors in collection. Previously those were only in my mailbox. :-) Thingles (talk) 02:32, 24 February 2013 (UTC)


 * This is great news, adding more dynamics to WikiApiary. This is especially important for impatient people that add their wiki and want to see it get going here right away. The bot log will uncompromisingly show if operators did a good job and, more importantly, show if something is going berserk on a wiki. I cannot wait for things to get even more exciting. :) Cheers --&#91;&#91;kgh&#93;&#93; (talk) 10:07, 24 February 2013 (UTC)


 * It was super fun looking at Special:RecentChanges as you were validating some stuff this morning. User:Bumble Bee is right there, just a few minutes behind putting new information in place. Very cool! Thingles (talk) 13:23, 24 February 2013 (UTC)


 * I just realised that there is a little Bumble Bug: Bumble Bee stopped collecting the http-server information. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 10:55, 24 February 2013 (UTC)


 * Good eye! :-) The way I setup the classes made it hard for me to figure out how to pull this information out. I'm going to add it back in, but I decided dropping it for a bit was worth getting the new version of the bot live. Thingles (talk) 13:23, 24 February 2013 (UTC)


 * Fair enough. This is interesting but not vital information. I guess we can wait until it is getting collected again. --&#91;&#91;kgh&#93;&#93; (talk) 19:39, 24 February 2013 (UTC)

Faster +a +v ?
I had three main concerns with validating and activating a large number of websites for collection in bulk.


 * 1) If there was an error in collection, I was the only person that could possibly see it since those errors came back to me from cronjobs and nobody else could remedy it. In effect, WikiApiary could be clogged with dead URLs and other garbage and nobody would even know, much less be able to fix it. Project:Bot log remedies this problem. Now anyone can see errors. It also sets the stage for User:Audit Bee deactivating sites that are in long states of errors.
 * 2) There was also just a pure limit in the old script that stopped after 1,000 websites. It was also serial. Using Project:Bot segments fixes this.
 * 3) The individual sites Property:Check every was ignored before, making scaling up challenging.

The new User:Bumble Bee fixes all these problems. Project:Bot log is visible and allows any operator to remedy issues. Project:Bot segments are in place and can easily scale up to many more segments to deal with a practically unlimited number of sites. And since Property:Check every is honored we won't over or under collect from sites.
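
As a rough sketch of how segments and the check interval could interact (this is not the actual Bumble Bee code): each run handles one segment and only picks sites whose interval has elapsed. The field names, the minute-based units and the `last_checked` bookkeeping are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    bot_segment: int    # value of Property:Has bot segment
    check_every: int    # value of Property:Check every, assumed to be minutes
    last_checked: int   # minutes on some shared clock; hypothetical bookkeeping


def due_sites(sites, segment, now):
    """Return names of sites in this segment whose check interval has elapsed."""
    return [
        s.name
        for s in sites
        if s.bot_segment == segment and now - s.last_checked >= s.check_every
    ]
```

Adding capacity then only means adding segments, since each run looks at nothing but its own slice of the sites.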

With this in place, I'm thinking that we should consider separating the validation/activation from the curating tasks (tagging, description and logos). I'm thinking we should consider just bulk validating and activating somewhere between 100 and 500 websites at a time (I would script this), then monitor Project:Bot log for issues and correct, then do another. It would only take a few batches to do all of the sites in Pavlo import project.

Thoughts? Thingles (talk) 14:03, 24 February 2013 (UTC)
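
The batching itself is simple; a minimal sketch, with a made-up helper name, of splitting the pending sites into fixed-size groups so the bot log can be reviewed between groups:

```python
def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For example, `list(batches(pending, 100))` would give the groups to activate one at a time, where `pending` is a hypothetical list of page names still waiting for validation.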


 * We will have to do this since it would take several months at the current speed until all wikis are validated and activated. We should probably start with chunks of 100 at the beginning to see how many of them run into errors. The most common one will be the error from failed extension data fetches, because the collection of extension data is activated for all imported wikis. About 18 % of the wikis imported have a MediaWiki version of 1.13.x or lower. Another 5 to 10 % will be running into all sorts of other problems. Thus I expect one quarter of the wikis to run into errors, which is quite a lot. However, this corresponds to the experience I gained so far from validating. What still has to be done manually is working on the Category:Multiple website. Still a good month to go, but this should be ok, I guess. --&#91;&#91;kgh&#93;&#93; (talk) 20:09, 24 February 2013 (UTC)


 * Hmm, well, I will put together a script to run under User:Audit Bee to activate a number of sites. This collecting what from where problem might need a rethink. The reason I put checkboxes on Form:Website for this was to allow someone to opt-out of collecting that info if they wish. Otherwise I could just automatically determine what to collect based on MediaWiki version and if SMW is installed. A compromise might be to have an Audit Bee process that can be run from time to time to set certain flags. Skins is the notable one here. Since that wasn't defaulted, few websites have it on. Another option would be to turn the checkboxes from turning things on to turning them off. Then based on MediaWiki version just pull everything, unless there is a check for "Do not collect extensions". Thingles (talk) 05:03, 25 February 2013 (UTC)
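
A sketch of the "collect everything unless opted out" variant, deciding from the version string reported by the wiki. The thresholds follow what is said in these threads (extension fetches fail on 1.13.x or lower, skins arrived in 1.18); the function names, flag names and the `opt_out` set are assumptions for illustration.

```python
import re


def parse_version(generator: str) -> tuple:
    """Turn a generator string like 'MediaWiki 1.20.2' into (1, 20)."""
    match = re.search(r"(\d+)\.(\d+)", generator)
    if match is None:
        raise ValueError(f"no version found in {generator!r}")
    return int(match.group(1)), int(match.group(2))


def collection_flags(generator, has_smw, opt_out=frozenset()):
    """Decide what to collect from the version, SMW presence and opt-outs."""
    version = parse_version(generator)
    flags = {
        "extensions": version >= (1, 14),  # fails on 1.13.x or lower
        "skins": version >= (1, 18),       # skin list added in 1.18
        "smw": has_smw,
    }
    # An opt-out always wins over what the version would allow.
    return {name: on and name not in opt_out for name, on in flags.items()}
```

With this shape the checkboxes on the form would only ever switch collection off, and new capabilities turn on by themselves as a wiki upgrades.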

Link wiki?
I saw that you gave my link wiki a vote. Thanks! I wanted to let you know that I've been experimenting with replicating that project for other people. I've already set a replicant up for my friend User:Garrickvanburen. If you're interested let me know. You would need a blank wiki with the standard Extension:Semantic MediaWiki suite and then I just need a bot account to set it up. By the way, this is what I'm working on setting up on MediaWiki Cookbook under packages, Link Wiki. Just offering... Thingles (talk) 04:59, 25 February 2013 (UTC)