Wikifunctions:Status updates/2024-07-10


Type proposals for accessing Lexemes

What is the past tense of the English verb “write”? Since that is an irregular word, Wikifunctions would get the answer wrong for now, using the regular Function, answering “writed”. Wikidata knows that the correct irregular form is “wrote”, but Wikifunctions cannot access Wikidata yet.

A multi-volume Latin dictionary, the Totius Latinitatis Lexicon by Egidio Forcellini (first published in 1771, this is a revised 1858–87 edition), on a table in the main reading room of the University Library of Graz.

One of the goals for this quarter, as presented last week, is to enable Wikifunctions Functions to access Forms of Lexemes from Wikidata. Once that is in place, we will be able to get the correct Form from Wikidata, whether irregular or regular, and use the regular Functions for the rest.
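The lookup-then-fallback idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `IRREGULAR_PAST_TENSE` is a hypothetical stand-in for Forms stored on Wikidata Lexemes, and `regular_past_tense` is a deliberately naive rule, not the actual Wikifunctions Function.

```python
# Hypothetical stand-in for Forms that would come from Wikidata Lexemes.
IRREGULAR_PAST_TENSE = {"write": "wrote", "go": "went"}


def regular_past_tense(verb: str) -> str:
    """Naive regular rule: add "d" after a final "e", otherwise add "ed"."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_tense(verb: str) -> str:
    """Prefer a stored Form; fall back to the regular rule for the rest."""
    stored_form = IRREGULAR_PAST_TENSE.get(verb)
    if stored_form is not None:
        return stored_form
    return regular_past_tense(verb)


print(regular_past_tense("write"))  # the regular rule alone gets it wrong
print(past_tense("write"))          # the stored Form takes precedence
print(past_tense("walk"))           # no stored Form, so the regular rule applies
```

With this sketch, the regular rule on its own produces “writed”, while the combined function returns “wrote” because the stored Form is consulted first.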

We have written and published a draft of some example Functions we are aiming to support, and these have guided us in defining the Types that are necessary to support these Functions:

  1. Lexeme
  2. Lexeme form
  3. Wikidata item
  4. Wikidata statement
  5. Wikidata property

Particularly the latter three will be incomplete representations of Items, Statements, and Properties, as we focus on the parts that we need for the initial access to forms.

We invite you to leave comments and suggest changes to the Types we will need to create and to comment on the document as a whole. Once we have incorporated any comments, we will be creating the necessary Types, which will allow us to work on the changes in the orchestrator that will access Wikidata.

Please have a look at the proposal, and leave comments and improvements.

Typed lists now open beyond Booleans and Strings

Until now, typed lists did not fully support elements of types other than Booleans and Strings. We have extended typed list support to cover all the types we currently support: natural number, integer, sign, Gregorian calendar month, Igbo calendar month, and day of the week. In the future, whenever we introduce a new type, typed lists of that type should be available automatically.

Please let us know if you have trouble using typed lists of the newly defined types.

Recording of the Volunteers’ Corner now available on Commons

The recording of the last Volunteers’ Corner is, as always, now available on Wikimedia Commons. Please let us know if you have comments!

Recent Changes in the software

Last week, we spent a fair bit of time trying to debug and fix an issue with running Functions (T368892). We think – hope! – that this is Resolved as of Tuesday 2024-07-09, but please let us know if you think things are still broken. The fix involved disabling the recent change that split out the returned meta-data from sub-calls within each Function call. We're looking into how to re-enable it without breaking production.

As part of our work this Quarter, we've continued improving the logs we're making in the back-end services by adjusting our logging utility (T364413); more work to come here soon. We also completed the re-write of the last of our browser test suites, related to connecting and disconnecting Implementations and Tests (T349836), which was a big focus last Quarter.

We've fixed the Function evaluator widget to immediately let you run a Function once an Implementation is connected, rather than needing to refresh the page (T343586). We've re-built the object selector widget so that clicking away from the menu after typing into the widget selects an exactly matching value if there is one, or restores the widget to its previous state if there isn't, to be consistent with user expectations (T351206).

We've fixed an oversight noticed by User:ScienceD90, and created the Z189/Validator and Z289/Built-in validator objects for Z89/HTML fragments (T368318).

We've added support for a number of new languages: Waali, as Z1405/wlx (T368046); Interslavic, via Z1750/isv-latn and Z1924/isv-cyrl (T366171); Chitonga, as Z1925/toi, and Chiluvale, as Z1926/lue (T368856); Jakaltek, as Z1927/jac (T369095); Kihunde, as Z1928/hke (T369157); Abron, as Z1929/abr (T369464); and Suret (Assyrian Neo-Aramaic), as Z1930/aii. If you're curious, this work is generally triggered by desire in the wider Wikimedia movement to support and use these languages, especially via TranslateWiki.

Function of the Week: Greatest common divisor (Z13612)

The Function of the Week for this week has been suggested by User:Autom. Thank you for the suggestion! If you want to make a suggestion, feel free to make it on the Function of the Week page.

The Euclidean algorithm was probably invented before Euclid, depicted here holding a compass in a painting of about 1474.

In mathematics, the greatest common divisor of two numbers is the greatest number that divides both numbers without a remainder. This is a long-known mathematical problem, and it gives us one of the oldest algorithms named after a person: the Euclidean algorithm, named after Euclid, whose description of the algorithm is the oldest we know today. Descriptions of the algorithm from both India and China, which seem to have been developed independently, are also known.

The greatest common divisor and the Euclidean algorithm are a good example for highlighting the difference between a Function and an Implementation: the Euclidean algorithm is but one way to calculate the greatest common divisor. It can also be calculated by getting the two lists of prime factors of the two arguments, and then multiplying all the primes they share. And there are many other ways to get to the greatest common divisor. All these different ways to calculate the result can take more or less time.
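To make the distinction concrete, here is a minimal Python sketch of two Implementations of the same Function. The names `gcd_euclid`, `prime_factors`, and `gcd_by_factoring` are chosen for illustration and are not the implementations on Wikifunctions; the factoring variant assumes both arguments are at least 1.

```python
def gcd_euclid(a: int, b: int) -> int:
    """Euclidean algorithm: replace (a, b) by (b, a mod b) until b is 0."""
    while b:
        a, b = b, a % b
    return a


def prime_factors(n: int) -> list:
    """Prime factors of n with multiplicity, by trial division (n >= 1)."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


def gcd_by_factoring(a: int, b: int) -> int:
    """Multiply the primes shared by both factor lists (a, b >= 1)."""
    remaining = prime_factors(b)
    result = 1
    for p in prime_factors(a):
        if p in remaining:
            remaining.remove(p)  # each shared prime is used only once
            result *= p
    return result


print(gcd_euclid(48, 18), gcd_by_factoring(48, 18))
```

Both functions return the same result for the same arguments, but the factoring approach does far more work for large inputs, which is exactly the kind of difference that separates Implementations of one Function.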

In Wikifunctions, we currently have four different implementations for the greatest common divisor:

  1. One composition
  2. One in Python
  3. One in JavaScript (these first three implementations are all meant to be based on the Euclidean algorithm)
  4. One using the Python standard library, which offers the greatest common divisor directly

The first composition claims to follow the Euclidean algorithm in its name, but doesn’t actually implement it correctly.

We have four tests, the pairs

The first test uncovers that the first implementation is incorrect: the implementation results in 18, but should result in 6. Nevertheless, the composition is connected. I would suggest that we either disconnect or fix the faulty implementation.

The tests mostly cover edge cases (three of the four). It would probably be a good idea to add more normal cases (to reliably catch an incorrect implementation such as the composition), but also to cover even more edge cases, such as the same number given twice, where that number is neither 0 nor 1.
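A broader set of cases could look like the following Python sketch. The pairs in `CASES` are illustrative choices, not the actual tests on Wikifunctions, and `math.gcd` from the Python standard library serves as the reference result.

```python
import math


def gcd_euclid(a: int, b: int) -> int:
    """Euclidean algorithm, used here as the implementation under test."""
    while b:
        a, b = b, a % b
    return a


# Illustrative pairs only; these are not the tests defined on Wikifunctions.
CASES = [
    (48, 18),  # normal case with a non-trivial common divisor
    (17, 5),   # normal case, coprime arguments
    (21, 21),  # the same number twice, neither 0 nor 1
    (0, 9),    # edge case: first argument is 0
    (9, 0),    # edge case: second argument is 0
    (0, 0),    # edge case: both arguments are 0
    (1, 1),    # edge case: both arguments are 1
]


def failing_cases(implementation):
    """Return the pairs where the implementation disagrees with math.gcd."""
    return [(a, b) for a, b in CASES if implementation(a, b) != math.gcd(a, b)]


print(failing_cases(gcd_euclid))  # an empty list means every case passed
```

Mixing normal cases with edge cases makes it much less likely that a faulty implementation passes by accident.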

User:Autom points out that this would be a good opportunity to discuss how much different algorithms can differ in speed. In the case of Wikifunctions, for now, the difference between programming languages (whether we are using Python, JavaScript, or a composition) is far more significant. For now, Python has a higher overhead than JavaScript, and compositions can in many cases (but not always) be slower than an implementation in code. Accordingly, the system in this case prefers the JavaScript implementation, followed by the two Python implementations, with the composition trailing. For the first test case, arguably the only test case that is not an edge case, the composition currently takes more than 11 seconds, the Python implementations about 4-5 seconds, and the JavaScript implementation a bit more than one second.

We hope to improve our backend performance considerably for all of these implementations, so that algorithm efficiency plays a much bigger role, but we are not there yet.

Thanks to User:Autom for suggesting this Function!