
Thursday, February 20, 2014

Conference papers

I've just finished reviewing a pile of papers for some upcoming conferences: IJCNN 2014 and EAIS 2014. While these papers represent some good work, in some cases their presentation leaves a lot to be desired. After spending my post-doc career in ecology, I have come to the conclusion that authors in our community could learn something from the way papers are written in other sciences.

In the sciences, a paper has five or six sections: the abstract, introduction, methods, results, discussion, and sometimes conclusions. Each of these sections has a specific function, and these functions fit papers in computational intelligence just as well as other sciences. Now, computer science in general, and computational intelligence in particular, is fairly unusual among academic disciplines in that we write full papers for conferences, and give conference papers almost as much weight as journal articles. But this structure evolved to make the contents of papers easier to understand, so it is just as applicable to conference papers as it is to journal articles. The difference is that conference papers mostly report preliminary work and are shorter, whereas journal articles are longer and report more complete work.

Firstly, the abstract. This is not just a slapped-on piece of text that kinda-sorta says what you did. The abstract is where you summarise the entire paper: what you did, why you did it, what you found. The abstract is the hook by which you draw the reader into your paper, so a bad abstract means people won't read (and cite!) your paper later on.

The introduction sets the scene for your paper: this is where you survey the relevant literature (including all of the introduction stuffing that seems to account for about half of most people's citation count), establish what the problem is, establish what has already been done, and say what you are going to do. If you have any hypotheses or research questions, this is where you lay them out. And every paper should be investigating some hypothesis or research question, even if you don't explicitly state it. The last part of the introduction is where you set out for your reader exactly what you are trying to achieve in your paper; the earlier parts are where you set out why you're doing it.

The methods section is where you describe what you did. If you are describing a new technique or algorithm, describe it here, then describe how you evaluated it. If you are using computational intelligence to approach a real-world problem, then describe how you did this. The methods should have enough detail that someone could replicate your work, if they had access to the same data as you did. Don't needlessly repeat well-known algorithms here: I'm quite sick of reviewing conference papers that describe a simple genetic algorithm in their methods section. I know how a simple genetic algorithm works, and so do 99% of the people who are likely to read that paper. If it's well-established, just say which algorithm you used and reference it; that's what references are for.

In the results section you report your results. Since this is a conference paper, you need to focus on the key results. Don't fill half a page or more with a table of numbers! Especially don't give your tables of results captions like "Table of results" - most of your audience will have at least a functional level of reading comprehension, and therefore will already know that they're results. The caption of a table should describe the contents of the table, especially what each column / row heading means, and what the numbers represent. Captions are supposed to be independent of the text; that is, a reader should be able to understand the contents of the table without having to read the entire paper. A large collection of numbers is hard to understand, so in a conference paper it is often better to use a graphical representation of the results than a table. The text of the results section describes the results but does not interpret them, so results sections can be quite short. You can describe any analyses of the results in the results section, but interpreting them should probably be left to the discussion.

The discussion in many ways mirrors the introduction, because you are interpreting your results in the context of the literature you cited in the introduction. You are also answering your research questions, identifying any potential shortcomings in your approach, and suggesting future lines of research. In other words, the discussion is where you bring together all of the other sections of your paper. A well-written discussion eliminates the need for a conclusion.

Write succinctly; don't spend a lot of time saying something that can be explained by a reference. When I started writing conference papers in the mid-to-late 90s, they were limited to four to six pages because conference proceedings were all on paper. Now that conference proceedings are on DVD, the page limits tend to be longer, closer to eight pages. But this is the limit, not the recommended number of pages. It's like the speed limit on the roads: the speed limit is the fastest you are allowed to drive, not the minimum speed you should be driving at all times. Just as you adjust your driving speed to the conditions, you should adjust the length of your paper to the material you are presenting: it is better to produce a succinctly-written, clear and to-the-point four-page conference paper than it is to produce an eight-page paper that covers the same work but buries the important points among pages of padding.

Every sentence in a conference paper needs to tell a story; if a sentence doesn't contribute something to the paper, take it out. Avoid common grammatical errors, and don't rely on a spell-checker. A spell-checker only tells you if a word is incorrectly spelt; it doesn't tell you if it is the wrong word for that sentence (I've seen quite enough instances of a "pubic announcement", which sounds far ruder than the "public announcement" I assume they meant). Proof-read the paper at least twice, and if English is not your first language, for goodness' sake get a native English speaker to read it. English grammar is bad enough for us native speakers: it has so many traps in it (especially with things like past / present / future tense) that errors are almost inevitable. Grammatical errors jar the reader out of the flow of the work, and if that happens often enough they will lose the thread of the paper and not understand what you are trying to communicate.

Researchers in computational intelligence can do good research, and are able to write good software. There is no reason they should not be able to write good conference papers.


Thursday, November 1, 2012

Reminder: paper submission deadline for IJCNN 2013

A reminder that the deadline for submitting papers to the IEEE International Joint Conference on Neural Networks (IJCNN) 2013 is February 1, 2013. This conference will be held in Dallas, Texas, August 4-9, 2013.

Friday, June 29, 2012

Google Brain

The so-called Google Brain has been in the news the last couple of days (see for example here, here and here, and see here for coverage from the ABC, including part of an interview I did with them on the subject).

The news coverage has focussed on how the machine learned to recognise cats, because cats are cute. Reading the original paper gives a more nuanced view of the technology. The researchers constructed a colossal neural network with one billion connections, implemented over 1000 16-processor servers. They then presented it with ten million images taken from YouTube, and left it to train for three days before looking to see what it had learned.

The researchers knew that the three most common images on YouTube were cats, human faces, and human bodies. So, they presented images drawn from independent data sets (that is, data sets that were not involved in training the network) that were known to be of cats, faces or bodies, and saw which parts of the network activated. By examining the activations within the network, they found that there were prototypes of cats, faces and bodies within the network. That is, they showed that the network had formed its own exemplars of these objects.
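To give a flavour of that kind of probing, here is a minimal sketch. This is not the researchers' actual code: the activation matrices below are random stand-ins for what would, in the real experiment, come out of the trained network when the independent test images were propagated through it.

```python
import numpy as np

# Random stand-ins for the activation matrices: one row per test image,
# one column per neuron in the layer being probed.
rng = np.random.default_rng(0)
acts_cats = rng.random((500, 10000))     # responses to known cat images
acts_control = rng.random((500, 10000))  # responses to non-cat images

# For each neuron, compare its mean response to the cat set against
# its mean response to the control set.
selectivity = acts_cats.mean(axis=0) - acts_control.mean(axis=0)

# The most cat-selective neurons are the candidate "cat prototype" units.
top_neurons = np.argsort(selectivity)[::-1][:5]
print("Most cat-selective neurons:", top_neurons)
```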

There are four main technical innovations in this paper:

1) The size of the network, which had one billion connections.

2) The technique they used to reduce the interconnections between the elements of the network, to make it easier to execute in parallel across the 16 000 processors they used.

3) The number of images (ten million) used to train the network.

4) The size of the images used (200 x 200 pixels, which is larger than most).

The network did not learn to recognise cats, faces, or bodies. It still doesn't know what a cat is, or what a face is, or what a human body is. It has no concept of what the images represent. But even so, it still has potential: neural networks have finally reached the age of Big Data.

Wednesday, March 28, 2012

Scientific Writing

Adam Ruben has written a rather tongue-in-cheek essay on How to Write Like a Scientist. He asks why we scientists can't write the way other people write. Why are scientific papers written the way they are?

Scientific papers are written the way they are because of their purpose. The whole point of a paper is to describe what the authors did and what they found, and to communicate this as widely as possible, to readers who may not have English as a first language, or who may be approaching the paper from a different field. If papers are going to do this effectively, they have to be unambiguous. The problem with being unambiguous in English is that there are so many words that have the same, or nearly the same, meaning - only, lone, sole, and so on.

Papers use the past tense because you are describing what you have done, not what you are doing or what you will do. Papers have used the passive voice for a long time, but I've noticed a change to active voice recently, and I'm trying to move to active voice in my own writing (my current supervisor once admonished me with "No-one in my lab uses the passive voice!"). The same thing applies to using "we" or "the author" instead of "I" - it's been the fashion to not use "I", but that's changing. If the work was done by more than one person, then it's entirely appropriate to use "we".

I especially liked his comments about the use of "obviously", and I'll admit I've used it a few times myself - not to demonstrate my intellectual superiority, but to forestall comments from reviewers: the times when I've not inserted a phrase like "obviously, ovens can be hot"*, at least one reviewer has pointed it out.

The use of idioms should be avoided. Idioms can be highly specific to a certain culture. For example, Australians and New Zealanders both speak English. Also, the New Zealand accent is close enough to the Australian accent that most of the time, when I speak, I can pass for a local. The one thing that gives me away as a New Zealander in Australia is the idioms I use: I use New Zealand idioms that just aren't used in Australia. Now, imagine I used those idioms in a paper read by people all over the world: how many people would understand it?**

I've touched on these issues in a previous post, as well as on common grammatical errors to avoid and ten rules for good writing. I think papers can be made more accessible without losing clarity, but it's going to take time, and a lot more work from authors.




*This refers to a sign on the oven in the tea room shared by the IT staff at Lincoln University: "Warning: Oven may be hot". Other signs in the area read "Warning: fridge may be cold" and "Warning: floor".

**That said, I have recently used the term "munted" in a paper - look it up if you want to know what it means!

Wednesday, February 22, 2012

Using MLP to model the distribution of bacterial crop diseases

A new paper I co-authored with Sue Worner at Lincoln University is now available and describes how we used MLP to model the global distribution of bacterial crop diseases.

We had data on the presence or absence of certain species of bacterial crop disease (that is, bacteria that infect and cause diseases in plants we use as crops) in 459 geo-political regions throughout the world. We also had data on the climate in these regions and the presence of host plant species. We created MLP that predicted the presence or absence of the bacteria species from climate (abiotic factors). We also created MLP that predicted the presence or absence of the bacteria species from the host plant species assemblages (biotic factors). While both of these approaches worked, we got much better accuracies by combining the outputs into ensembles, and by using a cascaded or tandem ANN approach.

Ensembles are a way of combining the outputs of several ANN. An input vector is propagated through each of the ANN, and the output values combined either statistically (the final output value is the max, mean or median of the uncombined outputs) or algorithmically (the output is determined by a majority vote of the uncombined outputs - that is, if the majority of the values are above a threshold, the output of the ensemble is a presence, otherwise, an absence). We looked at three different kinds of ensemble: firstly, ensembles of the best ten MLP trained on abiotic inputs; secondly, ensembles of the best ten MLP trained on biotic inputs; and finally, ensembles that combined the best ten MLP trained on abiotic inputs as well as the best ten MLP trained on biotic inputs. These last ensembles were particularly interesting, as they allowed us to make predictions of species distributions using both biotic and abiotic factors simultaneously. The rationale behind ensembles is that different MLP learn different parts of the problem space: by combining the outputs of several MLP, it is possible to cover a larger part of the problem space, and therefore to boost prediction accuracy. Combining abiotic and biotic factors follows the same idea: we know that an organism is affected by both kinds of factor, so combining them allows us to make more accurate predictions.
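To make the two combination schemes concrete, here is a minimal sketch. The output values are made up, and a real ensemble would of course combine many more MLP over many more regions:

```python
import numpy as np

# One row per MLP in the ensemble, one column per region; each value is
# that MLP's predicted likelihood that the species is present. Made up.
outputs = np.array([[0.9, 0.2, 0.6],
                    [0.8, 0.4, 0.3],
                    [0.7, 0.1, 0.7]])
threshold = 0.5

# Statistical combination: combine the raw outputs (here, the median),
# then threshold the combined value.
statistical = np.median(outputs, axis=0) > threshold

# Algorithmic combination: threshold each MLP's output first, then take
# a majority vote across the ensemble.
votes = outputs > threshold
algorithmic = votes.sum(axis=0) > outputs.shape[0] / 2

print(statistical, algorithmic)  # presence = True, absence = False
```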

While the ensemble approach boosted the prediction accuracies, we thought we could do better, so we created MLP that took as inputs the outputs of the very best MLP trained on abiotic and biotic factors. In other words, the outputs of the climate and host networks were used as the input values for a second-level of MLP, which were then trained on the presence and absence of the bacteria species. The idea behind tandem ANN is that, if a first-level network makes a mistake - that is, if a climate or host MLP makes an incorrect prediction - then the tandem network can learn to correct it. Again, we were combining abiotic and biotic factors to make predictions.
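Here is a sketch of the tandem arrangement, with scikit-learn's MLPClassifier standing in for the MLP implementation used in the paper, and simulated data standing in for the climate, host and species data. A real experiment would of course hold out test data rather than reusing the training set as this sketch does:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)

# Simulated stand-ins: climate (abiotic) and host-plant (biotic) inputs
# for 459 regions, plus presence/absence of one bacterial species.
X_abiotic = rng.random((459, 8))
X_biotic = rng.random((459, 20))
y = rng.integers(0, 2, 459)

# First level: one MLP per factor type.
clim_net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000).fit(X_abiotic, y)
host_net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000).fit(X_biotic, y)

# Second level: the tandem MLP takes the first-level outputs as its
# inputs, so it can learn to correct the first-level networks' mistakes.
level_one = np.column_stack([clim_net.predict_proba(X_abiotic)[:, 1],
                             host_net.predict_proba(X_biotic)[:, 1]])
tandem_net = MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000).fit(level_one, y)
```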

The upshot of all these experiments was that while single-level MLP were able to predict the distributions of the crop diseases fairly well, combining abiotic and biotic factors gave much better accuracy, whether that combination was achieved by a simple ensemble approach or by a tandem MLP approach.

This paper is published in Computational Ecology and Software, an open-access journal. Given my previous posts extolling the virtues of open access journals (see here, here, here and here) I'm putting my academic money where my mouth is, and submitting to open-access journals.

Monday, January 30, 2012

Publishing or perishing

The single most important metric by which an academic is judged is their peer-reviewed publication record. Promotion, grant applications, and finding new jobs all depend on having a strong publication record. This has long been described as "publish or perish": if you don't publish, you perish - either you don't advance in your job, you can't find a job, or you don't keep your job. Now, I've got a reasonably long publication record, but I'm always looking for ways of boosting my research output (see my previous posts on publishing in computational intelligence).

Several years ago, biologist Phil Clapham published an excellent essay on the need for academics to publish their research. One of his rules, that I am applying to my own work, is to have at least one paper under review at all times. Now, this can be pretty hard work - there is always variation in the amount of time that papers spend in review, and reviewer comments can take a long time to address. But, it does lead to building up your publication record quite quickly.

One outcome of this rule, though, is that one should also be writing at least one paper at any given time, while also generating sufficient publishable results for at least one paper at any given time. That is, while at least one paper is in review, you need to be writing at least one paper, and also writing code / designing experiments / performing analyses to go into at least one other paper. So, publishing is like being on a treadmill: as soon as you submit one paper, you need to get to work on getting the next submitted, while lining up the material for the one after that.

While this does encourage the practice of breaking research projects into small, easily published (and easily understood) chunks, I suspect it may also encourage further proliferation of single-publon papers (the publon being the smallest publishable unit of research). Whether or not this is a bad thing, I'll leave to you to decide.

Another way to boost your publication count is to collaborate. A lot. Computational intelligence is a particularly useful field to come from for collaborating, as the algorithms we study can be applied to so many problems. But that's all a topic for another post.

Saturday, May 28, 2011

Fuzzy Markup Language

Giovanni Acampora describes the Fuzzy Markup Language (FML) in a series of articles. FML is an XML-based method for describing fuzzy logic systems. Fields in the schema specify the fuzzy knowledge base, which consists of the fuzzy variables and their membership functions, and the fuzzy rule base. The schema also allows for the specification of the inference and defuzzification methods to use, and the type of fuzzy system (Zadeh-Mamdani or Takagi-Sugeno-Kang). Finally, it supports distributed fuzzy rule systems; that is, the user can specify the IP addresses of the machines on which parts of the fuzzy system should run.

The major advantage of using XML to describe a fuzzy system is interoperability. All that is needed to read an XML file is the appropriate schema for that file, and an XML parser. This makes it much easier to exchange fuzzy systems between software: for example, an application could extract fuzzy rules from a neural network (using the existing EFuNN and SECoS rule extraction algorithms, say), and those rules could then be read directly into a fuzzy inference engine or uploaded into a fuzzy controller. Also, with technologies like XSLT, it is possible to compile the FML into the programming language of your choice, ready for embedding into whatever application you please.
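To make the interoperability point concrete, here is a minimal Python sketch. Note that the element and attribute names in the XML below are only FML-flavoured approximations, not taken verbatim from Acampora's schema:

```python
import xml.etree.ElementTree as ET

# An FML-flavoured description of a single fuzzy variable; the element
# and attribute names are illustrative, not the official schema.
fml = """
<fuzzySystem type="mamdani">
  <knowledgeBase>
    <fuzzyVariable name="temperature" domainLeft="0" domainRight="40">
      <fuzzyTerm name="hot">
        <triangularShape a="25" b="32" c="40"/>
      </fuzzyTerm>
    </fuzzyVariable>
  </knowledgeBase>
</fuzzySystem>
"""

# Any standard XML parser can recover the knowledge base - which is the
# whole interoperability argument.
root = ET.fromstring(fml)
for var in root.iter("fuzzyVariable"):
    for term in var.iter("fuzzyTerm"):
        shape = term[0]
        print(var.get("name"), term.get("name"), shape.tag, shape.attrib)
```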

Although Acampora's motivation for developing FML seems to be to develop embedded fuzzy controllers for ambient intelligence applications, FML could be a real boon for developers of fuzzy rule extraction algorithms: from my own experience during my PhD, I know that having to design a file format and implement the appropriate parsers for rule extraction and fuzzy inference engines can be a real pain, taking as much time as implementing the rule extraction algorithm itself. I would much rather have used something like FML for my work.

Such standard, XML-based file formats would be useful for other areas of computational intelligence: a standard XML format for ANN, for example, would be fairly simple to implement and also very useful. I could imagine training an MLP, saving it in an XML-based format, then using XSLT to transform it to C++ and uploading it into an embedded controller. Conventional, static-architecture ANN like perceptrons, MLP, or SOM could easily be represented in XML.
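For instance, a hypothetical XML serialisation of an MLP's weights could be as simple as the following sketch. The format here is invented for illustration - no such standard exists yet, which is rather the point:

```python
import numpy as np
import xml.etree.ElementTree as ET

rng = np.random.default_rng(2)

# Stand-in weights for a 2-input, 3-hidden-neuron, 1-output MLP.
layers = {"hidden": rng.random((3, 2)), "output": rng.random((1, 3))}

# Serialise the architecture and weights; an XSLT stylesheet could then
# transform this into, say, C++ source for an embedded controller.
net = ET.Element("mlp", inputs="2")
for name, w in layers.items():
    layer = ET.SubElement(net, "layer", name=name, neurons=str(w.shape[0]))
    for i, row in enumerate(w):
        neuron = ET.SubElement(layer, "neuron", index=str(i))
        neuron.text = " ".join(f"{v:.6f}" for v in row)

print(ET.tostring(net, encoding="unicode"))
```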

I will be watching for further developments in this area of technology: I've had quite enough of designing my own file formats!

Wednesday, May 18, 2011

Modelling distribution of jellyfish with ANN

A new paper first-authored by David Pontin, my ex-PhD student from Lincoln University, is now available. It describes how he used MLP to model the presence and absence of a species of stinging jellyfish (Physalia physalis) at New Zealand beaches.

There are several interesting points about this paper. Firstly, because there have been no surveys of Physalia distribution, a surrogate data set was used: stings recorded by lifeguards of Surf Lifesaving New Zealand. Since lifeguards treat jellyfish stings, each incident has to be recorded; and since Physalia is the only stinging organism in New Zealand waters, a fairly large data set was available on the presence of these jellyfish. Predictions were made from oceanic variables such as wave height and direction, and wind speed and direction.

Secondly, the data was carefully cleaned: since stings of swimmers were used as the surrogate for Physalia presence, times when there were no swimmers at the beach were excluded from the data set. While this introduced a small missing-not-at-random bias, it also removed a large number of false absences: if an example was recorded as an absence, it was because there were no stings recorded, not because there was no one in the water.

Thirdly, an analysis of the contributions of each input of the ANN was performed. This showed which of the oceanic variables contributed the most to the presence of Physalia. This analysis indicated that there may be a hitherto unknown spawning ground for this species in the Tasman Sea.
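The paper's actual contribution analysis is not reproduced here, but a simple perturbation-style sensitivity analysis - one common way of estimating input contributions - gives the flavour of the idea. Everything in this sketch is a stand-in:

```python
import numpy as np

def input_contributions(predict, X, rng=None):
    """Estimate each input's contribution by measuring how much the
    model's predictions change when that input's values are shuffled.
    `predict` stands in for any fitted model's prediction function."""
    rng = rng or np.random.default_rng(0)
    baseline = predict(X)
    scores = []
    for j in range(X.shape[1]):
        shuffled = X.copy()
        shuffled[:, j] = rng.permutation(shuffled[:, j])  # destroy input j
        scores.append(np.mean(np.abs(predict(shuffled) - baseline)))
    return np.array(scores)  # larger score = larger contribution
```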

Finally, and this is in many ways the focus of the paper, the contribution analysis of the ANN was compared with the results of input contribution analysis by an evolutionary algorithm.

Overall, this is a nice little paper that neatly sums up David's work and contributes to the understanding of the behaviour of Physalia. This shows how useful computational intelligence is to ecological applications, an area where there is, in my opinion, enormous potential for computational intelligence researchers to make real, meaningful contributions.

Monday, September 27, 2010

Academic publishing

An excellent essay by Phil Clapham on the need for academics to publish their research. One of his rules, that I am trying to apply to my own work, is to have at least one paper under review at any given time.

This means, though, that I should also be writing at least one paper at any given time, while also generating sufficient publishable results for at least one paper at any given time.

While this does encourage the practice of breaking research projects into small, easily published chunks, I suspect it may also encourage further proliferation of single publon papers.

Wednesday, June 23, 2010

New Website on Evolving Connectionist Systems

I've just launched a website on Evolving Connectionist Systems (ECoS). ECoS are a class of constructive neural networks that learn very quickly and that do not suffer from catastrophic forgetting. The website has overviews of several ECoS algorithms, a comprehensive listing of the ECoS literature, and also links to the ECoS Toolbox, which is a collection of Windows command-line tools that implement several ECoS algorithms.

Update: this website is now at http://ecos.watts.net.nz/

Saturday, January 2, 2010

Turing Test for Game Bots

The Turing Test is very well known, not only within the CI community but also among the general public. Put really simply, a machine is intelligent if a human carrying out a conversation with that machine can't tell that it is a machine. In other words, we think it is intelligent, therefore it is intelligent.

A similar test has been proposed for game bots. It is described as follows:

"Suppose you are playing an interactive video game with some entity. Could you tell, solely from the conduct of the game, whether the other entity was a human player or a bot? If not, then the bot is deemed to have passed the test."

Playing games requires, in my opinion, more intelligence than having a conversation. It requires comprehension of the gaming environment, at least on some level, as well as anticipation of the actions of the player and the formulation and application of strategy. I have a suspicion that true general purpose AI will come from the gaming world. There's even a dedicated journal for it, the IEEE Transactions on Computational Intelligence and AI in Games.

The best part of this is that playing games can be part of your job.

Tuesday, December 29, 2009

Surprise in ANN

A new paper in Neural Networks describes integrating the concept of surprise from information theory with ANN learning. It's an interesting idea that I've only seen once or twice before (a colleague at Otago University has investigated something similar for a different kind of neural network). It also makes sense - things that are surprising to people are more strongly remembered (they stick in your mind, which is the same as learning them well). I'll be looking at integrating this concept into some of my own work.
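To give the flavour: in information theory, the surprise (or surprisal) of an observation x with probability p(x) is -log p(x), so rare events are more surprising. One simple way to fold that into ANN learning is to scale the size of each weight update by the surprisal of the training example. This is purely illustrative - it is not the scheme from the Neural Networks paper:

```python
import numpy as np

def surprisal(p):
    """Information-theoretic surprise of an event with probability p."""
    return -np.log(np.clip(p, 1e-12, 1.0))

# Purely illustrative: scale a gradient-descent weight update by how
# surprising the network found the training example, i.e. how little
# probability it assigned to the outcome that was actually observed.
def surprise_weighted_update(weights, gradient, p_observed, base_lr=0.01):
    return weights - base_lr * surprisal(p_observed) * gradient
```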