A wise person once told me that doing a PhD is as much a test of endurance as it is a test of intelligence. You face years of late nights, mountains of literature, numerous false-starts and dead-ends, and the nagging fear in the back of your mind that it might not be all that worth it.
Whether or not it is worth it is a question that only you can answer. Financially, it probably isn't. People who do PhDs tend to earn less over their lives than those who do not. They take longer to settle down and tend to delay parenthood and home-ownership until later in life. On the other hand, a PhD is your ticket into academia: the chances of getting a good, stable academic position without a PhD are, now, practically nil. A PhD can also gain you respect from the community: although I seldom use my title, it is useful when I do. Finally, there is the satisfaction of knowing that you achieved something most people never will, or never could. Personally, I did a PhD because I wanted to see if I could. It was the challenge of doing it that appealed to me. With my undergrad degree (first class honours in Information Science) I could have gone into the corporate world and made a very good living. I'd probably be in a high management position now, making a lot more money than I am making as a researcher, but I'd probably be miserable at the same time, because I'd never know how far I could have gone in research. And at the end of my life, I'd be asking myself, how much of a difference did I really make?
There are several factors that contribute to a successful PhD. Firstly, you must have a good supervisor. In fact, I'd go as far as to say that you need two supervisors, one senior and one junior. By that I mean that you need one supervisor who is an established academic who is well-respected in their field, and another supervisor who has recently completed their PhD. This is because the junior supervisor still remembers what it is like to do a PhD in the current time, while someone who did their PhD twenty years ago has probably forgotten. You must actively engage with your supervisors, to make sure that they are up-to-date with what you are doing and what you plan to do. A supervisor who is ignorant of what you are doing is a useless supervisor. Don't keep them in the dark!
You must have a clear idea of what your PhD is about. In other words, you must have a hypothesis, and research questions, and research goals. I even went so far as to make these explicit in the introduction to my thesis. It might take you a while to be clear about these, but you'll save a lot more time in the long run.
You must not underestimate the requirements for a PhD. Most universities award a PhD for "a significant original contribution to knowledge" (although most of them do not define "significant", "original" or "contribution"). So, a new algorithm for determining the contributions of the input variables of a neural network probably wouldn't be enough for a PhD, while the same algorithm presented in the context of a rigorous theoretical analysis of the neural network itself, along with an analysis of the algorithm, probably would.
You must not over-estimate the requirement for a PhD. In other words, you're not going to find a cancer cure, or discover the Higgs boson, or bring peace to the Middle East during the course of your PhD. Your PhD research problem needs to be enough for a PhD, and no more. Feature-creep kills PhDs as easily as it kills software projects. From chatting with more senior academics, I've come to believe that this is a more common problem than underestimating a PhD. A good supervisor will help you define the scope of your PhD project, while a bad supervisor will not. Get rid of a bad supervisor and find a better one. Or, at least seek help elsewhere.
You must stick with it. Everyone has a period during their PhD when it all looks hopeless, when you don't want to go on and just want to pack it all in. Hang in there. If you've decided that it's worth it before starting your PhD, it probably is still worth it, even if you don't feel like it. The enormous high you will get when you pass your examination is something you'll not feel often in your life (I found out I had passed my PhD examination two weeks before becoming a father, so I had all of my enormous highs in a short period of time).
It is likely that your examiners will want you to make some revisions to your thesis. Don't take this personally! The best thing to do is to just shut the hell up, make the changes as quickly as you can, and get the degree confirmed. Don't waste too much time arguing with the examiners, unless they are egregiously wrong (one of my examiners was egregiously wrong, in several places, and making the changes he wanted would have made my thesis worse, not better. In the end, I had to show my examination convener a pile of literature that showed that the examiner was wrong, and educate him on how innumerate the examiner was).
When you have passed your PhD exam, the next step is to get a job. If you want to be an academic, that means getting a post-doc. If you're organised, or lucky, then you might even have a post-doc organised before you finish your PhD. Don't restrict your search to just the field you did your PhD in. My PhD was in computational intelligence, but my two post-docs were in ecological informatics, and my current position is in ecological modelling. I'm not an ecologist, by any stretch of the imagination (although I do know a lot about ecology now) but because I am a flexible and fairly clever person I was able to work in these fields, and work effectively. Know what skills you have, and know how to advertise them to potential post-doc supervisors.
Once you're in a post-doc position, the only goal you should have is to publish as many papers as you can, as widely as you can and as quickly as you can. It can also be good to co-supervise some PhD students of your own, to attend conferences, edit journal special issues, and generally show the world that you are a good, hard-working and professional researcher.
But, above all, you must hang in there!
Tuesday, November 8, 2011
Thursday, November 3, 2011
Cargo Cult Statistics
One of the nice things about working in a world-class ecology group is the statistical rigor with which ecologists analyse their results. Unfortunately, this rigor is often missing in computational intelligence. Although I touched on some of these issues in a previous post on Minimum requirements for computational intelligence papers, I recently read an article (that shall remain anonymous) that actually made me groan. While I am starting to notice more papers with repeated trials, and even some that investigate several parameters, the analysis of these results leaves a lot to be desired.
Sometimes it is enough to simply list the mean and standard deviation of your accuracy measures. By itself, the mean is useful as a statistic that represents the population of accuracies that the algorithm yielded. The standard deviation is also good as a measure of spread of the values. But if your standard deviation is large, the paper needs some comment on why the algorithm is so variable. This is even more important when comparing different algorithms. An author might, for example, like to say that a neural network trained with evolutionary programming is better than logistic regression for their application, but if they are seeing a coefficient of variation of more than 60% then that implies that the algorithm is giving highly variable or even inconsistent results. To say that these results show that ANN are better than regression, without any statistical tests for significant differences, is simply nonsense.
Even if you do run such tests, you need to make sure that you are using the correct tests. What is the distribution of your results? Are they normally distributed? If they are not normally distributed, then you can't use simple parametric tests of significant differences like t-tests. If you are comparing several groups of numbers then an n-way ANOVA is more appropriate than performing n t-tests. These kinds of comparisons, of several groups of numbers, are very common in computational intelligence (the authors are comparing different algorithms over several data sets, or with different parameterisations) but I can't remember ever seeing a paper that mentioned ANOVA (if you can prove me wrong, please do so in the comments).
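To make the checks above concrete, here is a minimal sketch in Python, using only the standard library. It computes the coefficient of variation for each algorithm, then a one-way ANOVA F statistic across all of them. Everything in it is an assumption for illustration: the accuracy figures are made up, the function names are mine, and the p-value comes from a permutation test that I have swapped in for the usual F-distribution lookup, because it makes no normality assumption. Treat it as a starting point, not a recommended analysis pipeline.

```python
import random
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a fraction of the mean."""
    return statistics.stdev(values) / statistics.fmean(values)


def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.fmean(all_values)
    means = [statistics.fmean(g) for g in groups]
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Estimate a p-value by shuffling the group labels.

    Counts how often a random relabelling gives an F at least as large
    as the observed one. No normality assumption is needed.
    """
    rng = random.Random(seed)
    observed = anova_f(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if anova_f(shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical accuracies from five repeated trials of three algorithms.
accuracies = {
    "ann_ep": [0.91, 0.62, 0.88, 0.55, 0.93],
    "logistic": [0.78, 0.80, 0.79, 0.81, 0.77],
    "mlp_bp": [0.84, 0.86, 0.83, 0.85, 0.87],
}

for name, runs in accuracies.items():
    print(f"{name}: mean={statistics.fmean(runs):.3f} "
          f"sd={statistics.stdev(runs):.3f} "
          f"cv={coefficient_of_variation(runs):.1%}")

groups = list(accuracies.values())
print(f"F = {anova_f(groups):.2f}, "
      f"permutation p = {permutation_p_value(groups):.4f}")
```

A significant result here only says that at least one group differs from the others. Finding out which one needs a post-hoc comparison, and that is exactly the point at which I would go and talk to a statistician.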
I call this kind of shallow statistical analysis Cargo Cult Statistics. The term is inspired by Richard Feynman's famous speech about Cargo Cult Science. In this case, it means that while it looks like the authors are doing a statistical analysis of their results (they are calculating the means and standard deviations) it isn't really so, because they are missing out a huge amount of analysis that might actually tell them something useful about their results.
Now, I'm still learning about statistics (but, I'm still learning about everything, and will be until the day I die). But at least I know to ask someone with a better knowledge of statistics than me for advice on how to analyse my results, and I think it makes my papers much better.
Labels:
research craft
Wednesday, November 2, 2011
Reminder: paper deadline for KES-IIMSS 2012
A reminder that the deadline for submitting papers to the 5th International Conference on Intelligent Interactive Multimedia Systems and Services (KES IIMSS 2012) is 1st December 2011. This conference will be held in Gifu, Japan, 23-25 May 2012, simultaneously with the 4th International Conference on Intelligent Decision Technologies.
Labels:
call for papers,
conferences,
reminder
Tuesday, November 1, 2011
Reminder: paper submission deadline KES-IDT 2012
A reminder that the deadline for submitting papers to the 4th International Conference on Intelligent Decision Technologies (KES-IDT 2012) is 1 December 2011. This conference will be held in Gifu, Japan, 23-25 May, 2012.
Labels:
call for papers,
conferences,
reminder
Monday, October 31, 2011
Reminder: conference paper deadline ICFSNC 2012
A reminder that the deadline for papers submitted to the International Conference on Fuzzy Systems and Neural Computing (ICFSNC) 2012 is 30 November 2011. This conference will be held in Barcelona, Spain, April 11-13, 2012.
Labels:
call for papers,
conferences,
reminder
Friday, October 28, 2011
Conference paper submission deadline: BICS 2012
The deadline for submitting papers to the International Conference on Brain Inspired Cognitive Systems (BICS) 2012 is 15 January 2012. This conference will be held in Shenyang, China, 11-14 July, 2012.
Labels:
call for papers,
conferences
Thursday, October 27, 2011
Reminder: paper submission deadline for ISNN 2012
A reminder that the deadline for submitting papers to the 2012 International Symposium on Neural Networks (ISNN 2012) is 15 January 2012. This symposium will be held in Shenyang, China, July 11-14, 2012.
Labels:
call for papers,
conferences,
reminder
Wednesday, October 26, 2011
Reminder: Paper submission deadline for ICNC-FSKD 2012
A reminder that the deadline for submitting papers to the 8th International Conference on Natural Computation and 9th International Conference on Fuzzy Systems and Knowledge Discovery is 15 November 2011. These conferences will be jointly held in Chongqing, China, 29-31 May, 2012.
Labels:
call for papers,
conferences,
reminder
Tuesday, October 25, 2011
Reminder: Paper deadline for IEEE CIBCB 2012
A reminder that the deadline for papers submitted to the 2012 conference on Computational Intelligence in Bioinformatics and Computational Biology is November 20, 2011. This conference will be held in San Diego, California, May 9-12, 2012.
Labels:
call for papers,
conferences,
reminder
Monday, October 24, 2011
Paper submission deadline: CBR-MD 2012
The deadline for submitting papers to the International Workshop Case-Based Reasoning (CBR-MD) 2012 is 13 April 2012. This workshop will be held in Berlin, Germany, 20 July 2012.
Labels:
call for papers,
conferences
Friday, October 21, 2011
Call for papers: UCNC 2012
The deadline for submitting papers to the 11th Conference on Unconventional Computation and Natural Computation (UCNC) 2012 is 26 March 2012. This conference will be held in Orleans, France, 3-6 September, 2012.
Labels:
call for papers,
conferences
Thursday, October 20, 2011
Call for papers: ESANN 2012
The deadline for submitting papers to the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN) 2012 is 30 November 2011. This symposium will be held in Bruges, Belgium, 25-27 April, 2012.
Labels:
call for papers,
conferences
Wednesday, October 19, 2011
Paper submission deadline: MLDM 2012
The deadline for submitting papers to the 8th Industrial Conference on Machine Learning and Data Mining (MLDM) 2012 is 18 December 2011. This conference will be held in Berlin, Germany, 16-20 July, 2012. MLDM will be held jointly with ICDM 2012.
Labels:
call for papers,
conferences
Tuesday, October 18, 2011
Conference paper deadline: ICDM 2012
The deadline for papers submitted to the 12th Industrial Conference on Data Mining (ICDM) 2012 is 18 December 2011. This conference will be held in Berlin, Germany, 16-20 July, 2012.
Labels:
call for papers,
conferences
Monday, October 17, 2011
Paper submission deadline: ICML 2012
The deadline for submitting papers to the International Conference on Machine Learning (ICML) 2012 is 24 February 2012. This conference will be held in Edinburgh, Scotland, June 26 - July 1, 2012.
Labels:
call for papers,
conferences
Friday, October 14, 2011
On Presentations
Some presenters are applauded because the audience enjoyed their presentation. Other presenters are applauded because they ended their presentation.
Know which one you are.
Labels:
research craft
Call for papers: WIVACE 2012
The deadline for submitting papers to the Italian Workshop on Artificial Life, Evolution and Complexity (WIVACE) 2012 is 6 January 2012. This workshop will be held in Parma, Italy, 20-21 February, 2012.
Labels:
call for papers,
conferences
Thursday, October 13, 2011
Paper submission deadline: EvoStar 2012
The deadline for submitting papers to the European Conference on Evolutionary Computation (EvoStar) 2012 is 30 November 2011. This conference will be held in Malaga, Spain, 11-13 April, 2012.
Labels:
call for papers,
conferences
Wednesday, October 12, 2011
Call for papers: ICCCI 2012
The deadline for papers submitted to the 4th International Conference on Computational Collective Intelligence (ICCCI) 2012 is 15 April 2012. This conference will be held in Ho Chi Minh City, Vietnam, 28-30 November, 2012.
Labels:
call for papers,
conferences
Tuesday, October 11, 2011
Open research problems with Evolving Connectionist Systems
I described Evolving Connectionist Systems (ECoS) in an earlier post. A couple of years ago, I published a review article (PDF preprint) where I described the state of the art of ECoS, and identified several open research problems. There hasn't been much progress made in solving these problems, so I'm going to briefly describe them here, and hopefully stimulate a bit more work in this area. Of course, I'm doing a bit of work in some of these, but as I have a real job to do, I don't get as much time to spend on these problems as I'd like.
1) Input significance. With other ANN, especially the venerable MLP, it is possible to get an indication of how important each input variable is to the model. These methods are based on an analysis of the magnitude of the connection weights attached to each input neuron. This method won't work with ECoS networks, however, because the connection weights represent points in space. That is, the magnitude of the weight for an input neuron connection has nothing to do with how important that input is.
2) Optimisation of ECoS networks. While ECoS algorithms are fast learning, they can grow to be quite large, which makes them expensive in terms of memory and computational load. Ideally, it would be possible to reduce their size without sacrificing their accuracy. That is, it would be ideal if we could somehow eliminate redundant information in the ECoS and only retain that which is necessary for maintaining accuracy. I investigated a couple of methods of doing this in my PhD, and a few other people have looked at it as well, but no one has yet cracked the problem in terms of coming up with an optimisation algorithm that will significantly reduce the size of a trained ECoS network without significantly reducing its accuracy. Also, the most effective optimisation methods in the published work use evolutionary algorithms like genetic algorithms or evolution strategies. These are so computationally intensive that the speed advantages of ECoS are lost. An ECoS optimisation algorithm would ideally be as fast, or nearly as fast, as the ECoS training algorithm. It may be that this is inherently impossible.
3) Non-triangular fuzzy membership functions in EFuNN. The Evolving Fuzzy Neural Network (EFuNN) has triangular fuzzy membership functions (MF) embedded in its structure. These are fast and efficient, but other MF types (such as Gaussian) may be more useful for other applications.
4) Learning in the MF of EFuNN. The fuzzy MF in EFuNN are fixed, that is, they are set once and do not change during the life of the EFuNN. This is in contrast to the open, adaptive nature of EFuNN itself. An extension of the EFuNN learning algorithm that would allow the MF to adapt as the rest of the network adapts would be extremely useful for data mining applications. This algorithm would have to be as fast as the rest of the EFuNN learning algorithm, which may rule out backpropagation training of the MF, as is used in other fuzzy system optimisation.
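To illustrate problems 1 and 3, here is a small Python sketch. It is not taken from any ECoS implementation: the function names, weight values and parameters are all hypothetical, and the importance measure is a deliberately crude one (each input's share of the total absolute weight) standing in for the more careful weight-based methods used with MLP. The first part shows why reading importance off weight magnitudes can mislead when the weights are coordinates of points in the input space. The second part puts a triangular MF next to a Gaussian one.

```python
import math


def weight_magnitude_importance(input_weights):
    """Each input's share of the total absolute connection weight.

    input_weights[i][j] is the weight from input i to neuron j.
    """
    totals = [sum(abs(w) for w in row) for row in input_weights]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def triangular_mf(x, left, centre, right):
    """Triangular membership: 0 outside (left, right), 1 at centre."""
    if x <= left or x >= right:
        return 0.0
    if x <= centre:
        return (x - left) / (centre - left)
    return (right - x) / (right - centre)


def gaussian_mf(x, centre, width):
    """Gaussian membership: 1 at centre, decaying smoothly on both sides."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)


# Hypothetical MLP-style weights: input 0 is strongly connected, input 1 barely.
mlp_weights = [[0.9, -1.2, 0.8], [0.05, 0.1, -0.02]]
print("MLP-style:", weight_magnitude_importance(mlp_weights))

# Hypothetical exemplar-style weights: each column is a point in input space.
# Input 1 looks "important" only because its values happen to lie near 0.9.
exemplar_weights = [[0.10, 0.12, 0.08], [0.90, 0.95, 0.88]]
print("Exemplar-style:", weight_magnitude_importance(exemplar_weights))

# The same input value under the two membership function shapes.
for x in (0.3, 0.5, 0.7):
    print(x, triangular_mf(x, 0.0, 0.5, 1.0), gaussian_mf(x, 0.5, 0.2))
```

In the exemplar-style case the measure rewards input 1 for where the data sit, which says nothing about how much that input matters to the output.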
Although ECoS networks are very useful algorithms, they could be made even more useful if the problems above were solved. I'm working on some of them, but I would love to see others working on them as well. Contact me if you are interested in collaborating.
Labels:
neural networks