User talk:Kghbln/Archive 01

Hello
Thanks for signing up and registering a website. I'm still actively building out WikiApiary but you will shortly see your statistics start to show up. Have fun and hope you find the Apiary useful! Thingles (talk) 20:13, December 28, 2012 (CST)


 * Heiya Jamie, I just saw that you added this site to the Community Wiki and thought I'd give it a try. A great idea, and I am looking forward to seeing how it develops. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 20:15, December 28, 2012 (CST)

Website Summaries
I saw you added summary/description text to some of the wikis you added. I've been thinking I should add that as a property for the website form. Seeing you add it makes me think that I definitely should! :-) BTW, thank you for adding more wikis. That is awesome! Thingles (talk) 07:41, December 29, 2012 (CST)


 * I really like this wiki. The idea of providing a brief description is from the SMW community wiki. Having an extra field for this, including a related property, is a good idea. In return you could remove the free text field. I am not so sure about the tags I should use, and this is why I have not added any so far. Probably autocompletion on property values is the way to go here. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 07:52, December 29, 2012 (CST)


 * I've added the field to the form, added autocompletion and removed the free text area as you suggested (diff). Something is amiss with my autocompletion though. I've noticed this on another wiki of mine, need to debug what is happening there. Thingles (talk) 08:01, December 29, 2012 (CST)


 * Great! Looks good to me - approved! :) Why don't you just use "values from property" instead of "remote autocompletion"? Cheers --&#91;&#91;kgh&#93;&#93; (talk) 08:07, December 29, 2012 (CST)

Indexes on Wikis
I'm planning on adding a number of calculated indexes to WikiApiary shortly. Calculations using the existing statistics that would be interesting. One example I've been considering is an activity index that would be (edits in last 28 days/active users). Another would be admin index (active users/admins). Any suggestions you have on indexes would be great. Thingles (talk) 08:07, December 29, 2012 (CST)
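
The two ratios proposed above can be sketched in Python as follows. This is only an illustration of the arithmetic: the function and parameter names are hypothetical, and returning `None` for a zero denominator is an assumption, not something decided in this thread.

```python
from typing import Optional


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def activity_index(edits_last_28_days: int, active_users: int) -> Optional[float]:
    """Activity index: edits in the last 28 days per active user."""
    return ratio(edits_last_28_days, active_users)


def admin_index(active_users: int, admins: int) -> Optional[float]:
    """Admin index: active users per admin."""
    return ratio(active_users, admins)
```

A wiki with 280 edits in the last 28 days and 14 active users would get an activity index of 20.0 under this definition.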


 * You will see I made a Template:Website indexes that is now showing four calculated indexes for all wikis. I also added a table of "Most Active Wikis" to the Statistics page. Pretty fun! Thingles (talk) 08:24, December 31, 2012 (CST)


 * Yay, we two beat Wikidata :) It's getting better here by the day! --&#91;&#91;kgh&#93;&#93; (talk) 08:28, December 31, 2012 (CST)

Editor
I created a new Editors group and have added you to it so that I don't need to patrol your edits. :-) And also so that at some point Forms and Templates can be restricted but you can still edit. Any suggestions on other rights or groups that should exist would be very welcome. Thingles (talk) 06:32, December 30, 2012 (CST)


 * Great, thank you. :) One thing that comes to my mind immediately is . I will provide other suggestions along my way here. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 06:36, December 30, 2012 (CST)


 * Added! Thingles (talk) 06:40, December 30, 2012 (CST)


 * What about  for operators? Cheers --&#91;&#91;kgh&#93;&#93; (talk) 20:13, 21 January 2013 (UTC)


 * Good idea. Done. Special:ListGroupRights. Thingles (talk) 20:37, 21 January 2013 (UTC)

Changing subobjects
Just a heads up on a big change I'm making right now. Right now the bot writes extension and general information to /Extensions and /General subpages of a given website. The subobjects are then attached to that subpage with a linking Has website property. I'm changing this after doing some tests. I'm going to still write the subpages, but the data will be in  blocks. Then I will transclude the subpages into the website page.

This will have two wins. First, queries will be easier to create since all data will be in the website object or subobjects directly connected to it. Also, since a transclusion is in place when the bot changes the subpage it will trigger MediaWiki to update the page with the transclusion and get new data. This does mean some queries will have to be modified. Thingles (talk) 14:55, December 30, 2012 (CST)


 * I am touching wood. Support from me. :) Modifying the queries should not be the problem then. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 15:10, December 30, 2012 (CST)


 * Change is made and should be propagated through. When you look at the properties for a website you'll see all the data now there. Thingles (talk) 15:19, December 30, 2012 (CST)

Header tabs
Funny! I was thinking that adding header tabs to the website template would be a good thing! See you just did that! Sweet. I'll pull the graphs out of the tabs though. I'm signing off for a bit. Youngest is done with nap and we're hanging out. Thingles (talk) 15:22, December 30, 2012 (CST)


 * I guess this was natural thing to do. :) Admittedly, I have to do something else, too. Enjoy your family time now! --&#91;&#91;kgh&#93;&#93; (talk) 15:26, December 30, 2012 (CST)

Sync with Semantic MediaWiki Community Wiki
I was looking briefly at the SMW Community Wiki on Referata and I think it would be pretty simple for me to write a bot that "synced" the websites registered there over to WikiApiary. Do you think this would be a good idea? Bad idea? Inappropriate? Cool? What do you think? In about 15 minutes I already have the query for pulling all the public, active sites from that wiki. Now I just need to make sure not to create duplicates, and attempt to discover the API endpoint by retrieving the base page and looking at the meta headers. Thoughts? Thingles (talk) 22:31, January 4, 2013 (CST)
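
One way the endpoint discovery could work, as a minimal sketch: it assumes that "meta headers" refers to the `<link rel="EditURI">` element that MediaWiki places in the page head, whose `href` points at `api.php?action=rsd`. The class and function names are hypothetical, and the HTML is passed in as a string so that the fetching code stays separate.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urljoin, urlsplit, urlunsplit


class _EditUriFinder(HTMLParser):
    """Remember the href of the first <link rel="EditURI"> element seen."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        if (attr.get("rel") or "").lower() == "edituri" and attr.get("href"):
            self.href = attr["href"]


def discover_api_endpoint(page_url: str, html: str) -> Optional[str]:
    """Return the api.php URL advertised by a wiki page, or None if absent."""
    finder = _EditUriFinder()
    finder.feed(html)
    if finder.href is None:
        return None
    # The href may be protocol-relative, so resolve it against the page URL,
    # then drop the "?action=rsd" query string to get the bare endpoint.
    absolute = urljoin(page_url, finder.href)
    parts = urlsplit(absolute)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

Wikis that do not emit this link would return `None` here and would need a different approach or manual entry.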


 * A natural thing to think about. Still, I would not do it without asking Yaron for his thoughts, since he basically built up and fostered this directory. Otherwise it would appear like a bear taking a yummy honeypot. To a more or less large extent this wiki would supersede the community wiki after this was done. --&#91;&#91;kgh&#93;&#93; (talk) 05:13, January 5, 2013 (CST)


 * Thanks for the feedback. I agree, and I need to focus on getting the collector right before a bunch of hundreds of additional sites are added. Thanks! Thingles (talk) 14:59, January 6, 2013 (CST)


 * Once I get the rewrite of Bumble Bee (below) done and am actually honoring the Check every property I plan on publicly announcing this wiki on the SMW users mailing list. As it is, very few people know it exists. Thingles (talk) 15:05, January 6, 2013 (CST)


 * Still I would approach Yaron before announcing this wiki publicly since some people might see the similarity of both wikis and will ask about it. I am sure he will like this wiki anyway. --&#91;&#91;kgh&#93;&#93; (talk) 16:28, January 6, 2013 (CST)


 * Good idea. I don't know Yaron aside from some posts on the mailing list, but when I get this bot stuff sorted I'll drop him a note, and CC you. Thingles (talk) 17:34, January 6, 2013 (CST)


 * Perfect! Thank you for doing this. --&#91;&#91;kgh&#93;&#93; (talk) 12:16, January 12, 2013 (CST)


 * Nice response from Yaron! Thingles (talk) 09:58, January 13, 2013 (CST)


 * Indeed. He is also a big fan of good and innovative ideas ... --&#91;&#91;kgh&#93;&#93; (talk) 10:05, January 13, 2013 (CST)

Bumble Bee on Github
Just FYI, I've started working on a rewrite of the three hacked together Python bots that collect information into one well written bot. If you want to see it there is a WikiApiary repo on Github.

Some highlights:


 * 1) When Bumble Bee cannot talk to a website (it's not responding, returns bad info) I plan on having it write a templated log message to the Talk page for that website. That make sense to you? Would there be a better place for it?
 * 2) Right now Bumble Bee isn't honoring the Check every setting. It pulls every 15 minutes regardless. This one will, but I'm going to move that from Check every hours to every minutes so that some sites (Wikipedia) can have 15 minute collection periods.
 * 3) I'm going to consolidate the three separate scripts that get stats, extensions and general info into this one.
 * 4) I had originally planned on pulling in the list of namespaces. You see that in the websites form. I'm reconsidering that. I'm not sure it adds any value unless I did a page count by namespace. That would be a bit harder than I want to do now, so just a list of namespaces seems less useful.
 * 5) I'm totally excited about the Property:Has bot segment and running multiple Bumble Bees at once. I've already got that in the rewrite. I could easily run 4 of these at a time and keep stats collecting very fast without having to write threading code.
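
Points 2 and 5 together can be sketched roughly as below: each bot instance handles only the sites in its own segment, and only those whose check interval has elapsed. This is not the actual Bumble Bee code. The `Site` record and its field names are hypothetical stand-ins for the Check every and Has bot segment properties.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple


class Site(NamedTuple):
    name: str
    check_every_minutes: int  # hypothetical field for "Check every", in minutes
    bot_segment: int          # hypothetical field for "Has bot segment"
    last_checked: datetime


def due_sites(sites: Iterable[Site], segment: int, now: datetime) -> List[Site]:
    """Return the sites in this bot's segment whose check interval has elapsed."""
    return [
        site
        for site in sites
        if site.bot_segment == segment
        and now - site.last_checked >= timedelta(minutes=site.check_every_minutes)
    ]
```

Running one process per segment value then spreads the work across several bots without any threading code, which is the appeal described in point 5.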

Anyway, that is why you see a little less of edits in wikispace from me right now.

Thingles (talk) 15:04, January 6, 2013 (CST)


 * Thank you for keeping me in the loop. After you added the new property I found my way to GitHub. By the way, I am following you now. :)


 * …and I you! :-) Thingles (talk) 17:35, January 6, 2013 (CST)


 * ad 1) That's a good idea. This info belongs to the page since this is a page the admin of the wiki will most likely watch. Still adding semantics to the template for it to be also queried on a central maintenance page is favourable as well as the possibility to update this info within the template in a sense of "I worked on the problem successfully. Done."
 * ad 2) I was already wondering about these checks since my impression from looking at the graphs was that it is doing them much more frequently. My weblog is full of bee activity too. :) So Bumble Bee is presently as busy as a bee can get. :D Going to minutes is good. Besides, since Bumble Bee is doing these checks on a regular basis anyway, information about the uptime of the respective wiki could probably easily be added to the website's page. I guess this will be very interesting, at least for the admins.


 * If I tried to add availability monitoring as well I would probably put that in a separate bot by itself and have that run every 5-minutes. Not in my plans now, but who knows. Thingles (talk) 17:39, January 6, 2013 (CST)


 * I see. Fair enough. Still an admin fave. :) --&#91;&#91;kgh&#93;&#93; (talk) 13:58, January 7, 2013 (CST)


 * ad 3) When doing this it would be great to have the new structure of the template in mind, i.e. to split up the General information so it may be easily allocated to the new sections. However, as I got to know you up till now, you have already concrete plans to cater for this. :)


 * I plan on leaving the templates as is, just having them all generated in the same Python as opposed to three separate scripts. Thingles (talk) 17:39, January 6, 2013 (CST)


 * Ah, we will have to see how to allocate the infos from the template. Will involve querying. --&#91;&#91;kgh&#93;&#93; (talk) 13:58, January 7, 2013 (CST)


 * ad 4) I was wondering about this one too. However, I did not put much thought into it since I regarded this as a less important information. It is indeed less helpful having the purpose of this website in mind.


 * Unfortunately I don't see any method in the Mediawiki API that allows me to ask for a count of pages in a given namespace. It looks like I would have to ask the wiki to give me the list of page names and then count them, which I think is too expensive. I can envision some crazy future where there is a WikiApiary Extension that could optionally be installed and could collect this type of information locally and present it in a nice API. That will go on the Roadmap. Thingles (talk) 17:39, January 6, 2013 (CST)


 * Probably only evolved wikis use more than the ones provided out of the box. In case you have extra namespaces you could set up  to have their pages included in regular statistics. Having an extra extension to do this will be an option, too. Not on the front burner though. --&#91;&#91;kgh&#93;&#93; (talk) 13:58, January 7, 2013 (CST)


 * ad 5) A cool idea which will work out great!
 * Cheers --&#91;&#91;kgh&#93;&#93; (talk) 17:01, January 6, 2013 (CST)
 * PS I thought you might have another life with wife and kiddies and a job. No worries about your activity. :)
 * If there were just three of me it would be so much easier to get stuff done! :-) Thingles (talk) 17:39, January 6, 2013 (CST)
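
On the namespace page count discussed under point 4 above: the "list the pages and count them" approach could look roughly like the sketch below. It uses the API's `list=allpages` query with `apnamespace`, and assumes the newer `continue`-style continuation in the JSON response (older MediaWiki releases used a different `query-continue` structure). The `fetch` callable is a hypothetical stand-in for whatever HTTP client performs the request and decodes the JSON.

```python
from typing import Any, Callable, Dict


def count_pages_in_namespace(
    fetch: Callable[[Dict[str, str]], Dict[str, Any]],
    namespace: int,
) -> int:
    """Count pages in one namespace by paging through list=allpages."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": str(namespace),
        "aplimit": "max",
        "format": "json",
        "continue": "",
    }
    total = 0
    while True:
        data = fetch(params)
        total += len(data.get("query", {}).get("allpages", []))
        cont = data.get("continue")
        if not cont:
            return total
        # Merge the continuation values into the next request.
        params = {**params, **cont}
```

Each request returns only a limited batch of titles, so a large namespace needs many round trips. That is the cost concern raised above.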

Wikipedia Jobs Graph
Check out the jobs graph for Wikipedia (en). Interesting build up. At some point it would be interesting to trigger a notification to admins when jobs are not clearing fast enough. Thingles (talk) 09:59, January 7, 2013 (CST)


 * :) Probably there are some people constantly fiddling around with templates such as taxoboxes etc. at the moment. + from me. --&#91;&#91;kgh&#93;&#93; (talk) 14:00, January 7, 2013 (CST)

Reconsidering Farms
I've been reconsidering the issue of wiki farms that you brought up before. I'm thinking that my idea of tagging isn't really enough for farms/hosters/platforms. Tags are great for a bunch of things, but the farms themselves end up having data. I'm thinking that farms should be a specific (single) property of websites, and that those should populate "farm pages" in another namespace (very much like the extension namespace). Those pages will then be their own objects and hold aggregated data for the farm itself. I think that would be the place where a farm administrator would go to see status across their platform.

I'm not eager to attempt to import 30,000 Wikia wikis, but I am eager to see aggregated performance across my 14 or so wikis in my garden. And I can definitely see some smaller platforms with hundreds of wikis using WikiApiary actively.

What do you think? And if you agree, what would the namespace and property be called? I see one property of a farm being its type or classification, which would be autodetermined based on the number of wikis in the farm ( < 20 = garden, < 100 = ???, < 1000 = ???). Thingles (talk) 06:57, January 8, 2013 (CST)


 * I think that this is a good idea as you may already have expected. :) One thing that could be avoided is the statistical imbalance I already mentioned, the other is the specific aggregation you just mentioned. I would call the namespace "Wiki farm" since this seems to be the established and most commonly used term for it. I would however not do a named classification therein based on the number of instances within it as you proposed. There are just not enough of them around and the naming would be arbitrary with minimum additional benefit. In case you insist on this you could just call it "small wiki farm" ( > 10), "medium wiki farm" ( > 100), "large wiki farm" ( > 1000). The minimum number of wikis to qualify for the "Wiki farm" namespace should be 10 I guess. --&#91;&#91;kgh&#93;&#93; (talk) 09:00, January 8, 2013 (CST)


 * Excellent. I'm thinking just "Farm" and "Farm talk" for the namespaces, and Property:Has farm. I planned on just making these automatically created pages, similar to extensions. This would not allow for a minimum instance count before creating though. It does seem to me that a property of a farm should be the website count in it, regardless of whether that is named. I would also think that a farm type would make sense (commercial, non-profit, hobby). Thoughts on other properties of farms? I'll wire this all up in the next day or two. Fun! :-) Thingles (talk) 09:14, January 8, 2013 (CST)


 * Yeah, go for it! "Farm" and "Has farm" is ok, also the properties about size and type. Perhaps a property about establishment would be nice to have, but I am not so sure about it. --&#91;&#91;kgh&#93;&#93; (talk) 09:34, January 8, 2013 (CST)


 * By establishment do you mean the organization (if any) that runs the farm? Thingles (talk) 09:57, January 8, 2013 (CST)


 * Yep, Wiki was established in 2004 I guess, your farm came into existence in ... --&#91;&#91;kgh&#93;&#93; (talk) 09:59, January 8, 2013 (CST)


 * Farms now exist! Farm:Thingelstad.com I decided to not automatically create the farm pages. Thingles (talk) 20:02, January 8, 2013 (CST)


 * Not to do it automatically will not be a problem since there are only two hands full of farms around anyway. A great thing to have them. --&#91;&#91;kgh&#93;&#93; (talk) 14:19, January 9, 2013 (CST)


 * I fluffed up the templates and forms for farms. :) Cheers --&#91;&#91;kgh&#93;&#93; (talk) 16:29, January 10, 2013 (CST)


 * Thanks — looks awesome! Thingles (talk) 21:57, January 11, 2013 (CST)


 * And thank you for rectifying the shortcomings of my changes! --&#91;&#91;kgh&#93;&#93; (talk) 09:30, January 12, 2013 (CST)

✅

Bulk Loading
Check out Bulk import websites. Clearly the quickest way to add a bunch of sites, if so desired. I used it to import 13 websites tonight as a test, just pulling one of the groups from S23 WikiStats. Worked easy. Thingles (talk) 21:09, January 8, 2013 (CST)


 * That's definitely the way to go. The cool thing is, that Data Transfer recognises existing pages and does not import if instructed. So, no typos in titles please. :) --&#91;&#91;kgh&#93;&#93; (talk) 14:18, January 9, 2013 (CST)

Redid Main Page
Took a first attempt at a total revamp of the Main Page. Decided to try using the article count as a grouping mechanism. Step in the right direction... Thingles (talk) 21:57, January 11, 2013 (CST)


 * Definitely a good change. :) I just adjusted the format of the general stats section. Probably some of it may go into a separate class. I moved the bot stats out since they are not interesting at all for a regular visitor. Also they are unrelated to sites and farms anyway. Please move them in again in case you think they should be there. --&#91;&#91;kgh&#93;&#93; (talk) 09:33, January 12, 2013 (CST)

Lots of sites!
Wow — you are adding sites like crazy! That's awesome! Thanks! Thingles (talk) 09:22, January 13, 2013 (CST)


 * Yeah, but for the time being only sites of which I believe that they are not properly listed elsewhere, e.g. Biowikifarm wikis. The rest is for Bumble Bee. :) Cheers --&#91;&#91;kgh&#93;&#93; (talk) 09:25, January 13, 2013 (CST)


 * I added the Plantnet farm under the biowiki farm. I like the ability to have farms inside of farms. :-) I actually think for large imports I'm gonna just do a CSV bulk import rather than a bot. Main area to use a bot would be to maintain synchronization, but if it's just a one time load a CSV will be easier. Thingles (talk) 09:57, January 13, 2013 (CST)


 * Hey, that's what I kept for tomorrow to do. :) Ah, I remember that we already talked about the CSV import. You are right, this is what I meant. A hundred a day from the MediaWiki section (unaffiliated wikis)? --&#91;&#91;kgh&#93;&#93; (talk) 10:09, January 13, 2013 (CST)

Page Protections Added
I put in place some protection levels around the existing Project:Editors and Project:Operators groups. I added protection to the common forms and templates, as well as Main Page. You should be fine editing, but let me know if you see any issues. See Special:ProtectedPages for details. This is the first time I've set up custom protection levels. Was really happy to find this $wgRestrictionLevels 101 post to figure it out. Thingles (talk) 14:41, January 13, 2013 (CST)


 * Admittedly I have never used this parameter so far. However, it is great to know about it. I added the link to the tutorial to the talk page of mw.o. :) In case there is a problem I will drop you a note but everything seems to be ok. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 15:35, January 13, 2013 (CST)

Default timezone to UTC?
Right now WikiApiary is sharing the default timezone that all my Farm:Thingelstad.com wikis share. I’m thinking I should switch it to UTC. Agree? Thingles (talk) 16:38, January 13, 2013 (CST)


 * Strong support. Since the focus of this wiki is global ... --&#91;&#91;kgh&#93;&#93; (talk) 16:53, January 13, 2013 (CST)


 * Thought this would be easy, but sure MediaWiki is insisting on remembering "America/Chicago" timezone. Still working on it. Thingles (talk) 19:42, 13 January 2013 (CST)


 * Think I got it now! Had to explicitly do some stuff differently given the farm config I use. Thingles (talk) 01:50, 14 January 2013 (UTC)


 * It's also worth noting that all the data collected on wikis has always been in UTC, so moving the wiki to UTC was definitely needed. Pretty sure it's all good now. Thingles (talk) 02:43, 14 January 2013 (UTC)


 * Looks ok to me. Thank you. :) --&#91;&#91;kgh&#93;&#93; (talk) 20:09, 14 January 2013 (UTC)

✅

Wikicafe/General
I saw you edited Wikicafe/General. That edit will get overwritten on the next update. Was there an issue with the data? It looked like you added data that Bumble Bee wasn't able to see? Thingles (talk) 13:52, 14 January 2013 (UTC)


 * That's true, the data I added was not detected for some reason. Since I believe that the data collected is pretty static and do not think that Bumble Bee will update this page often, I thought I'd give it a try. Obviously the data will get lost on the next update and I am well aware of it. Since it is the biggest semantic wiki I will have a closer look at it in future anyway, now that it is here. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 15:15, 14 January 2013 (UTC)


 * Hey, Bumble Bee tricked me. It updates as soon as there is a difference between the data collected and the data saved. :) --&#91;&#91;kgh&#93;&#93; (talk) 09:22, 15 January 2013 (UTC)


 * Bumble Bee is actually even dumber than that. Every night Bumble Bee gets the list of extensions, builds a page of wikitext to describe that and saves it to the wiki. MediaWiki is the one that is smart enough to say "Hey, that's the same thing I have now. I'll accept this as a  event and not put anything in the history." So, anytime there is a difference between the page and Bumble Bee the change will be blown away within 24 hours. Thingles (talk) 17:16, 15 January 2013 (UTC)


 * Ouch, having been tricked by a dumb bee is not nice to read. ;) And I thought it would check by itself if there was a change in comparison to the day before. My thinking was of that of a truly much more complicated solution. I tend to be like this. This lesson is learned. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 17:48, 15 January 2013 (UTC)
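
For contrast, the "check by itself" variant imagined in the last comment could be sketched as below. This is not how Bumble Bee works according to the thread, which describes it as saving unconditionally and leaving the comparison to MediaWiki. The function name and the `save` callback are hypothetical.

```python
from typing import Callable


def save_if_changed(current_text: str, new_text: str, save: Callable[[str], None]) -> bool:
    """Call save(new_text) only when the generated text differs from the stored text."""
    if current_text == new_text:
        return False  # nothing to do, no request is sent
    save(new_text)
    return True
```

Either way a manual edit to a generated subpage is overwritten at the next run, because the generated text is built from the collected data alone.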

Monitored by
Thought you might find Monitored by fun. Way for sites to spread the word! Farm:Thingelstad.com has this on all wikis. :-) Thingles (talk) 03:25, 16 January 2013 (UTC)


 * I truly do and I will add it to my websites soon where possible. By the way, which software do you use to create it? Cheers --&#91;&#91;kgh&#93;&#93; (talk) 09:58, 16 January 2013 (UTC)


 * I used Acorn to create the layered image. And then crushed another 30% of bits out of the final PNG with ImageOptim. I'm no designer, but I think it came out decent enough. Thingles (talk) 13:07, 16 January 2013 (UTC)


 * It looks very professional to me, and probably only a few will see the need for improvement if at all. I was asking because I am not quite happy with mine. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 17:20, 16 January 2013 (UTC) PS Admittedly, the last time I worked with an Apple machine was in '88 on an Apple II. I am sure there is somebody nearby using an Apple.


 * I just realised that my little sister works with an Apple. I just transcluded the new Monitored by on CAcert in Berlin. I think I am the first one to do this apart from wikis on the Thingelstad.com farm. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 23:52, 20 January 2013 (UTC)

Released WikiApiary to Semantic MediaWiki Users
You'll see I sent an email to the Semantic MediaWiki Users mailing list introducing WikiApiary! I wasn't 100%, but I wrote it and hit send before I could change my mind. :-) If hundreds and hundreds of sites get setup, I may have to scramble. But we'll see. I put a special thank you to you in the email — thank you! :-) Damn.. looks like it stripped all my links out of the email. Thingles (talk) 04:57, 16 January 2013 (UTC)


 * Yeah, I read it and would like to say thank you that you appreciate my help on your project. This was very kind of you. Thanks a bunch! Cheers --&#91;&#91;kgh&#93;&#93; (talk) 17:15, 16 January 2013 (UTC) PS I am keen to see the momentum it will be picking up from the SMW Community. PPS Before making it public to a much wider community still some work has to be done, I guess.


 * It looks like WikiApiary picked up a few websites directly from the SMW mailing list, and got on people's radar. There wasn't any huge spike in traffic or usage. I did see in Piwik that a number of people spent more time than usual on the site. I agree that there is more work and cleaning to do before spreading the word further. My main concern is getting the new version of the bots in place. Do you have a list of things that you see as being important next steps? Maybe we should put a short list on Project:Roadmap? I'd like to get the word to the broader MediaWiki users mailing list next. Thingles (talk) 13:24, 17 January 2013 (UTC)


 * The list has about 500 subscribers, and judging from the traffic it gets, it will probably still take some time until all have seen the announcement. Admittedly I had expected a bit more visible enthusiasm, but trust me, the bee will collect them. The most important thing next to what you already stated is to work a bit on the look and feel of the website. I have made some changes to the main page and to the forms but I still think that this is preliminary. I think I dropped some ideas about it. The other thing is to get the Statistics page in order. Now you have to do a lot of scrolling and the load time is not the best either since the rendering of the graphs takes its time. Probably it should be split into two pages (standalone and farm). Also the other main pages need a bit of work. So far I have mostly made some changes to the extension's main page. To cut a long story short: let's work on the roadmap. --&#91;&#91;kgh&#93;&#93; (talk) 22:17, 17 January 2013 (UTC)


 * Related, would you like an account on my Piwik server so you can see traffic for WikiApiary? And/or, would you like to get a weekly report of traffic on WikiApiary? Feel free to say no, just wanted to offer since you have been very active. Thingles (talk) 13:24, 17 January 2013 (UTC)


 * I am curious, so why not? :) I believe I already subscribed to the WikiApiary weekly traffic report. Cheers --&#91;&#91;kgh&#93;&#93; (talk) 22:17, 17 January 2013 (UTC)


 * Great. Details just sent in email. Thingles (talk) 13:34, 18 January 2013 (UTC)

Extension pages have more data
I just modified the Bumble Bee script that writes the  pages to add three additional fields that are available in the API response that I was previously just ignoring. This diff shows the new data. I have not modified Template:Extension in use to pick up these new fields or set additional properties for them, yet. Thingles (talk) 13:20, 17 January 2013 (UTC)


 * Yeah, this is good. I believe we should now split up "tags" into "type" and "author", which would already have made sense earlier. I would rather not populate the extension pages automatically but do an import with the extra data once information about an extension is available. Will it be possible to have a spreadsheet with all extension data so I can work on it for the initial import? Except for the version and to some extent the author, these data are pretty static. So automatic changes should only be done for the version if possible. The way authors are provided is very inconsistent and does not look very professional. --&#91;&#91;kgh&#93;&#93; (talk) 16:29, 17 January 2013 (UTC)

✅

Talk:BrewWiki.com
Take a peek at the debugging I did on Talk:BrewWiki.com. Thingles (talk) 05:38, 19 January 2013 (UTC)

Collection errors on talk pages
You'll see that User:Bumble Bee started to record errors in collecting statistics to Talk pages for sites (Talk:29C3 Public Wiki‎, Talk:Heckipedia‎). I thought about writing a Template so that it could be styled in the wiki, but opted for just writing the exception text out full. I got worried the exception may contain special wikitext characters. But, a template may be the better way to go. I also changed the edits to bot edits after the first trial runs so it won't fill up Special:RecentChanges. Thingles (talk) 22:14, 20 January 2013 (UTC)
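
On the worry about special wikitext characters in the exception text: one possible approach, sketched here under assumptions, is to HTML-escape the message and wrap it in `<nowiki>` so that braces, brackets and heading markers are shown literally. The function name and the notice layout are hypothetical and not what Bumble Bee actually writes.

```python
import html


def format_error_notice(timestamp: str, exc: BaseException) -> str:
    """Render an exception as a single wikitext list item that is safe to display."""
    # Collapse newlines and runs of whitespace so the notice stays on one line.
    text = " ".join(str(exc).split())
    # Escaping "<", ">" and "&" keeps a stray "</nowiki>" in the message from
    # closing the wrapper early.
    safe = html.escape(f"{type(exc).__name__}: {text}", quote=False)
    return f"* {timestamp} collection error: <nowiki>{safe}</nowiki>"
```

Because the result is a plain list item rather than a section heading, it would also not add an entry to the page's table of contents.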


 * Hmm, even just seeing a few entries on Talk:BrewWiki.com I'm not liking how the current format I'm logging these errors in pollutes the TOC. Open to suggestions on how to format the error notices. I think ultimately the TOC should only be populated by human conversation, and bot errors should be recorded inline as they occur, but should be visually separate and more subtle. Thingles (talk) 22:34, 20 January 2013 (UTC)


 * (Edit conflict) Having the error message like this is fine. The only thing I wonder about is how frequently Bumble Bee will do this. Probably only when a problem occurs again after successful calls in between. --&#91;&#91;kgh&#93;&#93; (talk) 22:40, 20 January 2013 (UTC)


 * I do not think that we will have a lot of talk on website, farm, or extensions talk pages, so having these infos seems ok to me. These talk pages are more like workbenches for the person concerned. So far most of the talk was made on user pages here and most issues that will come up probably will continue to take place on user pages. Talk on other pages will probably be related to Bumble Bees edits. --&#91;&#91;kgh&#93;&#93; (talk) 22:40, 20 January 2013 (UTC)

Crazy Stuff to get at version info
Check out User:Thingles/Scratch4 for some truly insane stuff working to get a better way to detect version information from extensions. :-) Thingles (talk) 18:53, 21 January 2013 (UTC)


 * WOW! I feel very stupid when looking at the code. :) --&#91;&#91;kgh&#93;&#93; (talk) 19:29, 21 January 2013 (UTC)

Generated Authors for Extensions
See User:Thingles/Scratch3 to see how we can now get a unique set of authors for extensions automatically. This could be displayed somewhere on Template:Extension. I think there should still be a form for authors that is manually controlled. But this could make it easy to populate that, and serves as a secondary display of all users recorded around the wikisphere. Something similar can be done for Property:Has extension type and Property:Has extension URL. Thingles (talk) 21:22, 21 January 2013 (UTC)


 * To separate type and authors is a very good idea. During the past three weeks I have gone through about 150 extensions. In a lot of cases I have added more than the one available tag, e.g. Semantic MediaWiki because all of them apply next to the standard one provided by the extension itself. In a lot of cases I also amended the extension's description to make it more consistent or understandable at all. In general, as you also will have noticed, the extension registration is quite an awful mess, so working on them will be desirable. I think we could move the data in for all extensions I have not worked on so far, if possible, since the autopopulation would be a regression there. Thus we would have data for all extensions. Because of this it would be great to be able to edit all information bits. The data on the URL, tag and description should be quite solid I believe. Perhaps not automatic updates for them after they were added once. Hmm... --&#91;&#91;kgh&#93;&#93; (talk) 21:59, 21 January 2013 (UTC) PS I will work my way through the ones I have completed so far. So no worry about them. --&#91;&#91;kgh&#93;&#93; (talk) 22:08, 21 January 2013 (UTC) Things are on the roll, I guess. :) --&#91;&#91;kgh&#93;&#93; (talk) 22:19, 21 January 2013 (UTC)


 * I like your focus on a quality set of content for extensions. I think that's good. Just exploring some options, what if we added tabs to the Extension template and had the curated, clean descriptions and such on the first tab and then had a second tab that displayed the harvested information that User:Bumble Bee gets when he pulls? This data would be knowingly less accurate and noisy, but would be very complete with no intervention. Thingles (talk) 23:16, 21 January 2013 (UTC)


 * However, quality makes things harder to maintain. I do not think that having two tabs is a good option since it will perhaps cause confusion and requires a new set of properties. What about using #if: if nothing was added manually to the specific property on the extension's page, then #ask for this information as Bumble Bee harvested it and add it to the property? --&#91;&#91;kgh&#93;&#93; (talk) 15:04, 22 January 2013 (UTC) PS When it comes to authors and versions, the information harvested by Bumble Bee should always be set, i.e. no manual updates here. --&#91;&#91;kgh&#93;&#93; (talk) 16:10, 22 January 2013 (UTC)
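
The precedence rule proposed above (a manual value wins unless the field is harvest-only, otherwise fall back to what Bumble Bee harvested) can be sketched in Python. This only illustrates the logic, not how the templates implement it; the field names and the `HARVEST_ONLY` set are hypothetical.

```python
# Hypothetical field names; per the discussion, authors and versions
# should always come from the harvested data.
HARVEST_ONLY = {"authors", "version"}


def effective_value(field, manual, harvested):
    """Pick the value to store for one property of an extension page."""
    if field in HARVEST_ONLY:
        # Manual edits are ignored for these fields.
        return harvested
    if manual is not None and manual.strip():
        # A curated value exists, so keep it.
        return manual
    # Nothing was entered by hand: fall back to the harvested value.
    return harvested
```

For example, `effective_value("description", "", "Adds maps")` falls back to the harvested text, while a non-empty manual description would be kept.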

Importing 7,361 wikis
Okay, I've got my eyes squarely on Pavlo's Alive Filtered list of wiki API endpoints. 7,361 wikis. It looks like this is just api.php endpoints.

I've grabbed this file and it will not be hard for me to write a one-time bot that will pull the name of the wiki via the API endpoint and create a minimal Web site template page for the wiki. The more challenging part will be trickling them in, and I'm wondering what you think of a different approach.

First off, Bumble Bee would get crushed as it is with 7000 more websites (I've moved him to running every hour already btw). The new version I think will work fine, but I know the old one won't. How do you feel about importing all 7,361 (minus ones that don't respond to an API request) with Property:Is validated and Property:Is active set to False? Then, over time, groups of them are turned on as they are validated and activated, just like they would be if an end user submitted them. I think that would work fine, but it would create a huge list of inactive sites for a while. Thoughts?

I guess the other option is that I have a "skip lines" value in the bot and run it repeatedly with an ever-increasing skip value, but it seems like the net effect would be the same, just more gradual.

Thingles (talk) 02:44, 26 January 2013 (UTC)
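
A minimal sketch of the one-time import described above, assuming the list is a plain text file with one api.php URL per line. The siteinfo query is the standard MediaWiki API call for reading a wiki's name. The template field names (`Name`, `API URL`, `Validated`, `Active`), their `No` values and the helper names are hypothetical, and fetching the URL and saving the page are left out.

```python
import itertools
import json
from urllib.parse import urlencode


def siteinfo_url(api_url):
    """Build the siteinfo query for one api.php endpoint."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "general",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def sitename_from_response(body):
    """Return the wiki name from a siteinfo JSON response, or None."""
    try:
        return json.loads(body)["query"]["general"]["sitename"]
    except (ValueError, KeyError, TypeError):
        # Unparseable or unexpected response: treat as "did not respond".
        return None


def website_block(name, api_url):
    """Render a minimal page body with the site imported as inactive.

    The field names are assumptions, not the real Template:Website fields.
    """
    return (
        "{{Website\n"
        f"|Name={name}\n"
        f"|API URL={api_url}\n"
        "|Validated=No\n"
        "|Active=No\n"
        "}}"
    )


def batch(lines, skip=0, size=100):
    """Return up to `size` endpoints after skipping `skip` non-empty lines."""
    cleaned = (line.strip() for line in lines)
    non_empty = (line for line in cleaned if line)
    return list(itertools.islice(non_empty, skip, skip + size))
```

Running the bot repeatedly with `skip` increased by `size` each time gives the gradual variant, while one run with a large `size` gives the all-at-once variant with every site left unvalidated and inactive.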

Okay, that was quick. I have the bot done. It queries the API URL to get the site name and builds out the Template:Website block. I have saving to the wiki disabled for now. See Project:Pavlo import project for the example run I just did with the first 20 records. I'm not doing any checking to make sure it doesn't overwrite one of the existing websites. Not sure how critical that would be. This runs as Audit Bee.

Thingles (talk) 03:09, 26 January 2013 (UTC)
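
The missing overwrite check mentioned above could look roughly like this, assuming the bot asks the wiki's own API about the candidate page titles before saving. A standard `action=query&titles=...` request in the default JSON format marks pages that do not exist with a `missing` key. The helper names are hypothetical, and the HTTP request itself is left out.

```python
import json
from urllib.parse import urlencode


def existence_query(wiki_api_url, titles):
    """Build one query that reports which of the given titles exist."""
    params = {"action": "query", "titles": "|".join(titles), "format": "json"}
    return wiki_api_url + "?" + urlencode(params)


def missing_titles(body):
    """Return the titles the API reports as not existing yet."""
    pages = json.loads(body).get("query", {}).get("pages", {})
    return {page["title"] for page in pages.values() if "missing" in page}


def safe_to_create(candidates, body):
    """Keep only candidates the API confirmed as missing.

    Titles the API normalized or rejected will not match and are skipped,
    which errs on the side of not overwriting anything.
    """
    confirmed_missing = missing_titles(body)
    return [title for title in candidates if title in confirmed_missing]
```

Only creating pages that are confirmed missing means a failed or odd response results in skipped imports rather than overwritten websites.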


 * Yep, this can be done to get things going. What about doing a first batch with 700 pages to see what happens? Hmm, we really could use a second or third operator. --&#91;&#91;kgh&#93;&#93; (talk) 10:12, 26 January 2013 (UTC)


 * Significant progress, see Pavlo import project. Thingles (talk) 14:19, 26 January 2013 (UTC)


 * I am getting more and more excited. The automated import of the logos is really a big relief, as the logos were the biggest speed breaker. Great --&#91;&#91;kgh&#93;&#93; (talk) 14:52, 26 January 2013 (UTC)


 * I see you making a lot of edits. Note that Audit Bee (as it is) will overwrite those and require an undo. I'm feeling like we are pretty good to go and should just start the import to bring them all in? Thingles (talk) 15:25, 26 January 2013 (UTC)


 * Why so? Will Audit Bee start at the beginning again? --&#91;&#91;kgh&#93;&#93; (talk) 15:28, 26 January 2013 (UTC)


 * Good point... I can just skip the first 100 lines. :-) Duh. Thingles (talk) 15:29, 26 January 2013 (UTC)


 * I am relieved. I thought you might be doing batches at 100 anyway. --&#91;&#91;kgh&#93;&#93; (talk) 15:32, 26 January 2013 (UTC)


 * I have gone through some imported wikis now. I guess I will be able to work at 40 wikis a day, so about 180 days to go. --&#91;&#91;kgh&#93;&#93; (talk) 18:22, 26 January 2013 (UTC)


 * :-) Don't go too fast, I'm worried about Bumble Bee getting overloaded before I have the new version ready. Also, I added some very helpful links and buttons to the verify and activate stuff. Added an autoedit form link too. Much nicer. The main thing is that the links for the API calls Bumble Bee depends on are there, so they can be verified for real before activating a site that would then generate errors. Thingles (talk) 20:38, 26 January 2013 (UTC)


 * Bumble Bee tricked me, so this is my late revenge. :) Still, about 40 quality pages a day that cause no problems should be ok? Moving the info into separate templates and allowing direct edit is great. The latter is more for later than for initial page maintenance, but this is very ok. I will add hints regarding the required minimum version for the different API modules, as is already done for SMW. This will increase the quality of my edits. --&#91;&#91;kgh&#93;&#93; (talk) 20:52, 26 January 2013 (UTC)

Image uploading for Project:Pavlo import project
Seeing how many images have been uploaded today for this massive import, I'm so happy I took the extra hour to figure out how to automate that, even if it's far from perfect. Wow. Thingles (talk) 05:09, 27 January 2013 (UTC)


 * This is indeed a big time relief. The other very good idea was to provide the links for the API calls. This makes things much easier. This extra hour was a very good investment! Lots of kudos for ya. --&#91;&#91;kgh&#93;&#93; (talk) 09:48, 27 January 2013 (UTC)

Also, when this is done the data set in WikiApiary is going to be much cleaner than others I've seen. A lot of dead websites in that list, about 30%, and a lot of duplicates. Thingles (talk) 05:11, 27 January 2013 (UTC)


 * Yep, but more importantly it seems much easier here to keep the data sane. So we have built-in data sustainability. --&#91;&#91;kgh&#93;&#93; (talk) 09:48, 27 January 2013 (UTC)

Redirects
This made me LOL: "foks, have you ever heard of permanent redirects, that's spamming, keeping the first version." :-) Thingles (talk) 13:35, 27 January 2013 (UTC)


 * Yeah, I had a jolly good one, too. :) --&#91;&#91;kgh&#93;&#93; (talk) 13:38, 27 January 2013 (UTC)


 * That's cool and must have a meaning. Bumble Bee just created the page for the FindSpam extension. :) --&#91;&#91;kgh&#93;&#93; (talk) 13:39, 27 January 2013 (UTC)