mtbc: maze C (black-yellow)
In the School of Life Sciences where I work we produce systems like the Image Data Resource which is full of strange pretty pictures acquired from expensive microscopes and used to justify scientific conclusions. There is also some initial proof-of-principle code for reproducing analyses via the IDR Jupyter Hub. OMERO.figure is also rather neat: turn the raw image files acquired from the microscope into figures ready for Adobe Illustrator to put into your paper; information in the figures like timepoints, scalebars, etc. is derived from the metadata encoded by the microscope as it acquired the images.

Modern academic life is highly competitive and journals are far more keen to publish interesting new discoveries, however lucky, so there is great career pressure to report the right kinds of findings. Further, many of them turn out to be difficult to impossible to reproduce. Even despite this, my impression is that the kind of research misconduct I have in mind is, at its core, well-intentioned: the results may be a little doctored, or an unusually significant subsample, or whatever, but the researcher does generally believe the hypothesis that they are trying to prove, they are just exaggerating the evidence for it.

I figure that our work stuff might be useful if it helps to encourage a culture of sharing all the raw data and the procedures by which it was analyzed. But, I wonder if this papers over a more fundamental problem: that the people generating the hypotheses are also those testing them. I am amused to be thinking of this as a conflict of interest.

I can see why it happens. The people who have the idea are probably the more enthusiastic about testing it. Maybe not many labs are used to working with those cell lines or protocols or whatever at all so it is not like any lab could just pick up the work. And, even if we had a system where the people who generate hypotheses are separate from those who test them, one can see that there is still scope for mutual back-scratching and the like. One can imagine the specifics of the experimental design would be something of a negotiation between the hypothesizer and the tester.

So, I am not saying that even this pipedream idea of having researchers' hypotheses tested by third parties is a good one even if it were workable. But, I do wonder if there is some related but realistic way in which scientific research could be restructured to make it more trustworthy.
mtbc: maze C (black-yellow)
At work we have a number of server-side unit tests. I find some of these annoying. Server-side we offer various services which to some extent use each other internally. For unit tests there is not a real server running so when a test tries executing some server-side code there must be mock objects that fake appropriate responses from the other parts of the server that they attempt to use.

On the one hand, such unit tests typically run quickly and easily enough that they can be placed quite early in our code quality checks: a problematically failing test can be discovered long before the culprit is merged into a running server and the integration test suite run against it. (The integration tests use a real running server.)

On the other hand, not only are these fake appropriate responses an inferior substitute for the real thing, meaning that the unit tests are perhaps not testing a server that properly corresponds to reality, but as somebody who works on the server internals I find these unit tests a maintenance headache: if I change something about how the server works then I must fix the affected unit tests to fake new values in a new way. That is, I effectively have to correspondingly adjust the sequences of behavior from the fake server.
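The trade-off can be seen in a minimal sketch, assuming Python's standard unittest.mock rather than whatever the real codebase uses. ThumbnailService, get_size and the faked return value are hypothetical names invented for illustration; they are not taken from the server described above.

```python
from unittest.mock import Mock


class ThumbnailService:
    """Hypothetical service that depends on another server-side service."""

    def __init__(self, pixels_service):
        self.pixels_service = pixels_service

    def describe(self, image_id):
        # Internally calls another service, as real server code would.
        width, height = self.pixels_service.get_size(image_id)
        return f"image {image_id}: {width}x{height}"


def test_describe_with_mock():
    # No server is running, so the dependency's response is faked.
    fake_pixels = Mock()
    fake_pixels.get_size.return_value = (512, 256)

    service = ThumbnailService(fake_pixels)
    result = service.describe(42)

    # The test also pins down exactly how the dependency is called, which is
    # what makes it brittle when the server internals change.
    fake_pixels.get_size.assert_called_once_with(42)
    return result


print(test_describe_with_mock())
```

If describe were later changed to fetch the size through a differently named call, the service could still be correct against a real server while this test failed until the faked call was rewritten to match.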

Times may be changing in relevant ways. Perhaps the computing cost of running integration tests was much larger.

At my last job we had a less manual approach to providing data for tests.

I should clarify that I am fine with unit tests in general: I have written various new ones into our codebase but mine do not exercise internal server dependencies enough to require many return values from mock objects to be faked.

Following on from my previous comments about contemporary code quality, surprise test failures suggest the code was not thought through well.
mtbc: maze C (black-yellow)
For a few years I led technical hiring in a small business. Within reason I had the freedom to run that hiring largely as I wished. Seeing as Americans tend to have detailed college transcripts I typically wanted to see some coding examples and their transcripts for a first interview and by the seat of my pants I would let these and the applicant's answers to questions lead the course of the interview in a very personalized way. For different candidates I had correspondingly different concerns and often some latitude to tailor the position to them after hiring. For some applicants who seemed to be doing passably but without any dazzle I would sometimes extend the interview to give them a few more questions to see if they could manage a late jump above the bar. With one candidate, whom I became very glad to have hired, during their first interview they made a dumb mistake halfway through and seemed to fall apart afterward: I risked offering them a second interview in which they did fine.

Colleagues would tell me that I give tough interviews; they felt sorry for the candidates. This is partly because I wanted to know how far the applicant could go and how well they can still dance a little way out into new territory so I would typically ask enough to find if they knew a bit more or could guess intelligently. It was a good sign if they couldn't answer all of my questions confidently because it meant I was interested enough in them to find what the extent of their ability was. Further, I wanted to know that what was listed for them on paper was actually still somewhat also in their head: if they cram for an examination then forget the material then that is of no use to anybody.

This is not to say that I was perfect at hiring: one time I managed to hire somebody with A grades in a three-course undergraduate series in analog electronics. In later needing to construct some interface hardware for high-speed data acquisition I was alarmed to apparently be introducing them to the concept of an operational amplifier. I also managed to hire a software developer who was clever but unfortunately confidently thought themself to be cleverer than they were. Mostly I did well though.

Last week I was on an interview panel for a position at the university. It was interesting and odd to be working within a formal process imposed by those above me. Especially, each candidate is to receive the same interview, very much down to duration and questions. While it is not how I would run my own business, I can certainly understand the organization's desire to ensure fairness and as their employee I play by their rules. Still, it makes for new challenges: for example, noting what I would have liked to ask each of the diverse candidates then trying to generalize that to questions also useful or applicable with the others. So, we asked what we needed while indeed doing our best to interview the applicants identically. It was a very odd experience for somebody with my recruitment background but I think that it worked well enough.

Still tired

Jun. 3rd, 2017 02:22 pm
mtbc: maze C (black-yellow)
My mood is improved but I am still tired. This morning I stayed in bed until maybe even 9h and at 10h I was still in the armchair, resting my head and yawning. My workout was gentle indeed, barely managing 22 of the machine's calories per minute. I have decided to already scale back my exercising. I now have aching limbs and a growing headache; I have taken acetaminophen with codeine. I have a few things to do this weekend but not many, perhaps few indeed if rain comes.

I thought I might just be tired from our annual users' meeting at work: I was tired indeed on the evenings following the meeting days and I am not naturally extroverted so the social interaction itself may be taxing. But, given my tiredness last week too I wondered if I may even have some low-grade infection or somesuch. Perhaps the meeting was tiring indeed for those who traveled from different time zones, including Portland, Oregon, and Kobe, Japan. Our Brisbane-based colleague was unable to join us for family reasons. Many of our visiting users are interesting and nice.

The users' meeting comes with various free-food perks, including a wine-tasting last Thursday evening. Tastings are not especially my kind of thing but this year I did taste three of the eighteen and that worked out well. There was some kind of full-bodied red I liked, then an unfiltered red I followed it with was a bit sweeter, then I had some Chinese ice wine that was sweeter still. There were also cheeses and I always like cheese.

I have a bit of work to do early next week as leftover help for some visiting users I talked with but nothing too onerous. I enjoy having a bit of variety in my work tasks.
mtbc: maze L (green-white)
I have been quieter lately through feeling tired and irritable. I am usually at my desk by 8h or so and, in fasting, I don't take time for lunch, but I don't want to leave too weirdly early so I usually stay until 16h30 or so. Sometimes I stay later, for example on Tuesday I stayed beyond 17h because, as I was wrapping up, a series of three people came to ask me things. This is okay, I like to be helpful, but it did mean a longer workday.

Unfortunately, I awoke rather prematurely on Wednesday. I do not make much use at home of unexpected morning time because I am distracted by knowing that I have to leave the house so I just head to work unusually early. Over the afternoon I had a migraine, though not quite bad enough to prevent me from working, and on the way home I had a detour via the doctor's office which added maybe a half-hour to my trip, so that became a long day indeed. Yesterday I again awoke well before the alarm and started my workday correspondingly early.

An early workday makes sense for me.

I have wondered if, though I do not eat at lunchtime, I could instead use the time well for some other personal task. I am not sure what that might be, though. Especially, I like to keep my own work separate from my paid work, on different computers, but I do not want to risk routinely bringing my own laptop into work. I do at least sometimes take a brief walk, as I did today to the mailbox.

Recently my work has been difficult.

I have also been unusually busy at home. For example, last Saturday one of my children needed to be in Dundee all day and tomorrow they both need to be. So, that is basically a day of the weekend rather perturbed. I thus wonder if everything has added up to push me a little over a tipping point lately.

Feeling myself to be much inclined to utter screw this and let chips fall, I skipped exercising for a couple of days and have instead treated myself a little: for instance, after exercising yesterday I took a bath instead of a quick shower. Today I felt somewhat better. Not only did I sleep for longer but I also felt a little more enthusiastic. I did well in my exercising. Perhaps I am returning to a more tolerable state of mind but I will try not to push it too far.
mtbc: maze C (black-yellow)
Here in the European Union the clocks went forward an hour last night; now we are back at our usual difference from much of the United States. At work our team is rather distributed and we have many online meetings so in advance e-mails we remind participants of the applicable state of affairs. For my own part, for very many years now my .xinitrc has started up an xclock for each of TZ=EST5EDT and TZ=GB. After Brexit I hope that the UK at least continues to match Western European (Summer) Time.
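The temporarily unusual difference can be checked with a small sketch, assuming Python 3.9 or later with the standard zoneinfo module and a system time zone database. It uses the zone names Europe/London and America/New_York in place of the TZ=GB and TZ=EST5EDT values above, and the 2017 dates are an illustrative assumption because the entry itself is undated.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def hours_ahead(when_utc, east="Europe/London", west="America/New_York"):
    """Return how many hours `east` is ahead of `west` at a given UTC instant."""
    east_offset = when_utc.astimezone(ZoneInfo(east)).utcoffset()
    west_offset = when_utc.astimezone(ZoneInfo(west)).utcoffset()
    return (east_offset - west_offset).total_seconds() / 3600


utc = ZoneInfo("UTC")
# Illustrative 2017 dates: the US changed on 12 March, Europe on 26 March.
for day in (1, 20, 27):
    instant = datetime(2017, 3, day, 12, 0, tzinfo=utc)
    print(instant.date(), hours_ahead(instant))
```

Under those assumptions the gap is five hours before either change, four hours in the weeks between the two changes, and five hours again afterward.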

This morning I awoke, saw the clock, and was pleased that I had managed to sleep for a long time, before [personal profile] mst3kmoxie reminded me that the clocks had changed. I like to own clocks that listen for the radio signal so I never have the fuss of manual adjustments. Though, even better would be clocks on wifi given that I have my home router's dhcpd.conf include option ntp-servers … for its responses though I suppose that carries no zone information.
mtbc: maze C (black-yellow)
I attempted to watch a British Computer Society talk on the Refal programming language, which seems to be the kind of thing that interests me but it was difficult to tell. The brightness of the projected slides is such that the text on the light background is so washed out as to be unreadable and the audio, perhaps because of reverberation within the room, is also nearly impossible to usefully make out. I have seen this problem also with larger auditoriums: a lapel microphone can pick up audio reasonably but the video of the speaker and their projected slides is a disaster and one must train audience members to delay their question until a microphone has reached them or speakers to repeat the question.

At work our team is rather dispersed so we frequently face this audio/video issue for meetings. Fortunately we can typically share notes and slides directly from computers rather than by videoing projected displays. We have a couple of good microphones in the room but people do have to remember to speak loudly and clearly toward them and for remote listeners it still sounds like the in-room people are speaking from the other end of a large concrete pipe. We also have an annual users' meeting much of whose content would be nice to capture.

In our modern times I would have liked to think that remote meeting attendance is more commonly a solved problem: for example, that the research complex in which I work would have some meeting rooms optimized for it, with the projector also able to provide an electronic copy of what it shows, and for smaller meeting rooms some microphone setup that allows reasonable audio capture without requiring a microphone especially for the current speaker. Perhaps our human senses are more marvellous than I realize given that my in-person experience of meetings and seminars is so much better than the typical remote or after-the-fact attempt. It is possible to offer remote meeting attendance well but it is a rare surprise to find it actually done adequately except for largely one-way presentations wherein recording is done with screen capture from a sole presenter's computer for a wholly remote audience.
mtbc: maze C (black-yellow)
This morning I traveled down to York for the Light Microscopy Facility Managers Meeting. The railway journey was pleasant: not only does the Fife Circle Line see pleasant scenery but, true to its name, the East Coast Main Line shows even more coast. I see the ships and the seaside towns and wonder at all the lives I don't know.

I reflected on my change in perspective since moving to Perthshire. I used to think of York as a cold, windy place very far north. Now, it is surprisingly far south, approaching Manchester's latitude.

Even south of Edinburgh the train was pleasantly empty. Perhaps trains in the north are like the motorways: the congestion is largely south of Leeds, making life in Scotland pleasant indeed.

In speaking with others about microscope vendors I came to wonder if they really don't want to be in the image analysis software business: they care about hardware and feel as if they must bundle proprietary software with it but it is usually not great software and perhaps everybody would save money if they simply pooled resources toward some open-source effort. It is serious microscopes that I am thinking of here, probably costing a few hundred thousand dollars each.

I made a discovery at the hotel about coffee machine user interfaces: apparently when the columns of drink option buttons are above multiple spouts then the coffee is likely to issue from the spout below the corresponding button. This guideline is consistent with the few observations that I made.

Caroling

Dec. 8th, 2016 10:25 pm
mtbc: maze G (black-magenta)
This morning I sang my first Christmas carols of the season at an informal jolly affair in a large indoor area of the research complex in which I work. There was a flutist and mince pies and the coworker I found myself standing beside sang strikingly well indeed. It made for a pleasant break from usual routine. We were joined by small children from the University nursery: they sat in a group and looked quite bemused.
mtbc: maze I (white-red)
I find it interesting how, just occasionally, when I am not at a computer, I realize something quite specific that is probably wrong in code that I wrote, something that relies on precise recall of its details: when I go back to check then it usually transpires that the fear was well-founded and I have something to fix. This suggests that I have cognition occurring that can work on intellectual problems beyond my conscious awareness. Or, perhaps I saw that my code was wrong while I was working on it yet that realization could not immediately come to mind but was instead strangely delayed.

To take a trivial example from yesterday's Java programming, at work we have model objects that all implement the IObject interface that includes accessors for an id property: I correctly guessed while driving the car that I was comparing object x with another object's y.getId() instead of like with like. My code still did not work as hoped after I fixed that mistake but that is a problem that can wait for the new week.
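The Java in question is not shown, so the following is only a Python analogue of that kind of slip. Model and get_id are hypothetical stand-ins for an object with an id accessor, not the real IObject interface.

```python
class Model:
    """Hypothetical stand-in for a model object exposing an id accessor."""

    def __init__(self, id_):
        self._id = id_

    def get_id(self):
        return self._id


x = Model(7)
y = Model(7)

# The mistake: comparing an object with another object's id.
mismatched = x == y.get_id()
# Like with like: compare id with id.
matched = x.get_id() == y.get_id()

print(mismatched, matched)
```

The mismatched comparison is never an error, it is just quietly false, which is why it is the sort of thing that gets noticed later rather than at the keyboard.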
mtbc: maze I (white-red)
For file encryption at home I have always used whole-volume encryption: previously LUKS on Linux, now the softraid crypto on OpenBSD. At work they became very excited about laptop security so, although I am working on opensource projects rather than students' confidential data, I thought that I should at least add some post-installation encryption to my work laptop and the most convenient solution was to use eCryptfs to encrypt my home directory. That way the key isn't stored anywhere on the system and I don't need to type an extra passphrase because, perhaps via PAM, it simply uses the password I already type to log in.

Overall I have found eCryptfs quite workable. My first build of the day takes twice as long but that's okay as once the system has got going performance seems fine. For schroot's fstab I needed to switch the mount of /home from bind to rbind because of how eCryptfs uses a mount to /home/mtbc once I log in. I suppose that for backups at work I could now just back up the encrypted view of my home directory but my backup script instead tars up the plaintext and runs it through gpg on the way to a network drive. I already need gpg at work anyway for tasks like signing releases.
mtbc: maze C (black-yellow)
Yesterday was the annual symposium of our Gene Regulation and Expression division. For me it had some amusing aspects: for instance, when I see mention of DMSO (dimethyl sulfoxide, used as a solvent and in DNA sequencing) I am reminded of DHMO (dihydrogen monoxide) and for the Hallowe'en-themed baking competition the cakes bore names like Mouse Dissection.

I got to thinking about how our group's microscopy image management software OMERO might be generally useful in the life sciences. Broadly, we allow organization and annotation of one's images, with easy plugins for one's own bulk analysis scripts and suchlike. It occurs to me that the ability to mine data out of many images may allow one to vary the organism (say, knock out genes) and its environment and observe changes in protein expression, morphology, etc.: learn the relationships between what one can control and the functional or behavioral effects. Further, given how much biological research is done on organisms where the colonies go through many generations as the individuals have short lifespans and reproduce frequently, I wonder to what extent we enable phenotype-directed evolution for creating designer organisms, if the image analysis can guide which individuals' progeny to favor.

I also wondered about the many talks in which researchers presented what one might regard as being positive results: that some particular thing is shown to have some effect or whatever. One might think that even negative results would be useful information that tell us more than we knew before. I wondered: do the many positive-sounding results reflect educated choice of which experiments to perform or are there perhaps at least as many performed whose outcome seemed more disappointing? I can imagine that an information-theoretic perspective might suggest that a healthy rate of disappointments is entailed by any efficient empirical course of scientific discovery: if we are good at guessing what will work then perhaps we are not gaining much new information from each experiment.
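One way to make the information-theoretic hunch concrete is to treat each experiment as a yes/no outcome and compute its Shannon entropy. That is a deliberate simplification chosen for illustration, since real experiments are rarely binary, and the function name is hypothetical.

```python
import math


def outcome_entropy_bits(p_success):
    """Shannon entropy, in bits, of a yes/no experiment with success probability p."""
    if p_success in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    q = 1.0 - p_success
    return -(p_success * math.log2(p_success) + q * math.log2(q))


for p in (0.5, 0.9, 0.99):
    print(f"p={p}: {outcome_entropy_bits(p):.3f} bits")
```

Under this model an experiment expected to succeed half the time yields a full bit, whereas one expected to succeed nine times in ten yields roughly 0.47 bits, so a run of reliably positive results suggests that each one is teaching comparatively little.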

Perhaps there is a weak analogy to be made with funding software projects wherein the less-glamorous but entirely appropriate maintenance of the codebase is quietly done alongside the more attractively fundable addition of new features: maybe one may follow a relatively information-efficient course of scientific research so long as enough positive results come out along the way. One would hope that it is appropriately to one's credit if some of those results are also rather surprising, encouraging the riskier, more information-rich, lines of inquiry.

Update: On reflection, I do now recall that one talk told us that P affects Q and also appears not to affect R, though I don't think we got confidence in the latter quantified.
mtbc: maze C (black-yellow)
At work this morning I left some notes on a Trello card about where I am up to with a current effort on a new feature; it had been some days without my doing more on it than quietly beavering away. I also try to maintain a steady stream of small work products that are more immediately tangible such as opening a relatively trivial GitHub PR before returning home. Back when I wasn't supervised but had an intellectually challenging problem to solve then it might be quite some days before I felt as if I were verifiably making progress but I learned to trust the process and know that the initial stage in which I'd cogitate was both necessary and valuable. Now I am back in a more junior role I find myself again actively providing a steady stream of observable evidence that I am indeed continuing to effect received instruction.
mtbc: maze C (black-yellow)
For computers that I think of as being reliable production systems I tend to run OpenBSD if it's personal and the base installation largely suffices or Debian GNU/Linux otherwise. This is because each offers a series of stable releases (as opposed to a rolling release) to which they promptly backport important security fixes. So, I get to minimize security vulnerabilities while also avoiding many bugs: code only gets into a stable release after it has been tested for some time, in combination with the libraries upon which it depends, meaning that others have usually discovered and fixed the problems already and if they haven't then there are a lot of people who have exactly the same ecosystem running so reproducing bugs and sharing tips is much eased. Also, it is nice not to have to work with myriad packaging systems (CPAN, Hackage, etc. too) where one will do.

At work I write simple Python with much help from the official documentation and Stack Overflow. Already knowing Java, Perl and Haskell was a good starting point. I am not much familiar with the Python community nor with pip, although in terms of package management it hardly seems to be quite dpkg or xbps. Because of the above reasons, and with wanting to keep my work environment reliable, on my work computer I install our Python dependencies via Debian's own packaging of them. I am thus relying mostly on code that is widely used and tested while getting the frequent smaller changes that are truly warranted. I do use xbps-install -u on my personal laptop but if I thus break it a bit then I can live with the partial outage.

For OMERO our more Python-savvy people are moving toward an installation that uses virtualenv and pip that largely installs according to requirements like gunicorn>=19.3 with advice to use pip install --upgrade when upgrading OMERO to get the security updates for such dependencies. This obviously goes against my natural thinking in that different users may then get different combinations of library versions depending on when they upgrade, and possibly fairly infrequently given that OMERO itself can go for a long time without some urgent security release. However, I am not the one with experience in developing and deploying complex Python-based applications.
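The concern about differing combinations can be sketched with a toy model of a lower-bound requirement. This is not pip's resolver and it ignores most of Python's real version syntax; the release lists are invented for illustration and are not gunicorn's actual history.

```python
def parse(version):
    """Turn a dotted version such as '19.7.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def newest_satisfying(minimum, available):
    """Pick the newest available version that is >= minimum."""
    candidates = [v for v in available if parse(v) >= parse(minimum)]
    return max(candidates, key=parse)


# Hypothetical release histories as seen at two different install dates.
seen_earlier = ["19.2", "19.3", "19.4"]
seen_later = seen_earlier + ["19.7", "20.0"]

print(newest_satisfying("19.3", seen_earlier))
print(newest_satisfying("19.3", seen_later))
```

The same requirement selects 19.4 for the earlier installer and 20.0 for the later one, so two sites running the same OMERO release could be running different dependency versions.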

I thus wonder what is typical in the Python community and among those sysadmins who are responsible for the smooth running of the resulting software: if distribution-provided packaging is generally eschewed in favor of getting latest versions from pip and its ilk. Is there some security announcement list that generally covers pip's package repository? Has many years of Debian unduly conditioned me against whatever the latest released version of something happens to be? Are the main default repositories used by pip curated better than I realize? I don't know, but I'm curious to learn how my thinking is wrong in this case. I may be too conservative or have too little appreciation of the Python community's quality assurance efforts. It is worth noting that at work we have DevOps-style continuous integration servers frequently reinstalling and testing OMERO automatically so it may simply be that I don't give that extra safety net enough credit.

Update: My impression is that for installation on production servers we will largely be advising sysadmins to install the distribution-provided packaging of our software dependencies.
mtbc: maze C (black-yellow)
I work in the University of Dundee's Life Sciences Research Complex in which constituent components have academic staff and students so there are various seminar series. The computational biology ones started up again with a most pleasant surprise from Katarina Blow. If I heard correctly, she is but an undergraduate from Cambridge who joined us for some weeks of summer work, but in explaining her analyses of a simulation related to the Thomson problem she clearly had command of the material and it was among the better seminars I've attended. Although I am employed as a software developer, I am technically research staff and encouraged to use my time with the university as a learning experience.

Those sharing the building complex with me tend to have advanced technical qualifications, access to computers, notebooks, online calendaring, etc. Maybe a week and a half beforehand I had received e-mail advertising the seminar so I noted it accordingly and went along to it: not a difficult task, one would think. I was one of the few who was there by the time it was to start: apparently many others didn't make it on time because the division secretary hadn't e-mailed out a reminder on the same day. Indeed, I often receive several e-mails about the same seminar which is annoying because I check each over in case any of the details (time, venue) have changed. Honestly, are these people really unable to recognize an interesting e-mail the first time and arrange some means of remembering to act on it? It's not impressive, nor are the various e-mails I receive with the subject prefixed PLEASE READ as if I might think that the message had been sent to me for some reason other than that I am intended to read it.
mtbc: maze C (black-yellow)
I have managed projects and executed tasks given by others.

I had quite a good workday today. I had the go-ahead to work on a modest task on code with which I already have some familiarity. The aspects that had highest technical risk went relatively smoothly. With luck I'll have finished off this next step early next week. What I will be doing in a week's time though, I don't know.

With my work in Dundee I have a couple of new issues to manage. One is that I have some view into what is needed but it is hardly comprehensive.

The other issue is that we use many technologies for our modest team size.

In short, I sometimes feel a bit at sea, both in terms of my work and others'. I am sure that people would listen if I had ideas about how to improve that but my sense is that there are no easy answers: it may be inevitable given that we try to do a lot in a fast-moving field with limited resources.
mtbc: maze C (black-yellow)
I had intended to attend the Scottish Microscopy Group's annual symposium but it turns out to be on Thanksgiving Day. At least I made it to last year's though in our internal notes it was SMG which I always couldn't help but read as Sarah Michelle Gellar.
mtbc: maze C (black-yellow)
We at the Open Microscopy Environment provide open-source software on GitHub under ome and openmicroscopy with extensive developer documentation for both Bio-Formats and OMERO. We distribute our software for free and support it mostly through mailing lists and forums.

Our users often praise our support, suggesting that it exceeds what is typical of commercial paid support. They also, despite often having software development skills themselves, rarely contribute fixes or features: they are much better at reporting issues than addressing them.

This week I was intrigued to hear the suggestion that we don't get anywhere near as many external contributions as one might expect of free open-source projects with a large userbase because we are too good at promptly answering users' questions and actually delivering code changes to address them. It generalizes the idea of waiting a day before investigating support requests in case people just figure out the issue themselves.

I don't believe that we have any intention to respond to this hypothesis by becoming less helpful but it is an interesting problem that I had not considered before.
mtbc: maze C (black-yellow)
Much software developed in an academic context is abandoned and rots away after the project finishes and the students move on. My work is mostly on OMERO which does not easily fit into the standard model. Many research groups and academic institutions find our software very useful. The OMERO system running at work for the School of Life Sciences is a long-lived production system and it has possibly a thousand kin around the world, as well as many lesser deployments. OMERO enables research and collaboration well enough to have independent reviewers judge that the project delivers great value.

OMERO needs work on an ongoing basis. We need to make sure that many people can continue to deploy and use it so as new versions of underlying platforms come out, such as operating systems, language implementations, third-party libraries, we must adjust our software to match. Furthermore, microscopy has been advancing at a high rate for many years, in ways both qualitative and quantitative, and to remain useful OMERO must adapt accordingly. We are always playing catch-up but through ongoing effort we remain close enough. To take a simple example, OMERO uses Bio-Formats to read microscope image files but a new version of the microscope software may write the images in a new format and we sometimes first find this out when users report that OMERO suddenly can't read their images.

However, I am classified as research staff even though I don't do any research. As a professional software team delivering and supporting production code I wonder if we are out of place in the university. Although we strongly serve research funding agencies' ultimate goals, software maintenance is not something that they are inclined to fund; they instead seek an exit strategy and opine that the university itself should fund the more mature projects. Back in the real world, this would have research groups around the world diverting significant effort in recreating poorer facsimiles of OMERO. Possibly the agencies would prefer us to become a commercial entity and to fund us indirectly through researchers' overhead in budgets though I suspect that would rather inhibit generous code contributions to our project from the wider community.

I wonder how other important pieces of research infrastructure are funded. People seem to agree it's worth doing and that others should indeed fund it.
mtbc: maze C (black-yellow)
At work our group is represented publicly online through Twitter, Facebook, mailing lists, web forums, etc., and a blog. Recently I have been working on blog entries. I finished one blog entry off today and drafted another on the geometry I had been working on. Wanting to represent matrix algebra in the latter I wondered how best to achieve that.

We use Jekyll to build our blog with kramdown as the markdown syntax. After some experimentation a pleasant discovery has been that I can include MathML in the blog entry and MathJax, a JavaScript library, appears to do a great job of causing it to display well even if the browser does not actually support MathML.

Mark T. B. Carroll

Page generated Jul. 20th, 2017 04:29 pm