It could be that they use different BLAS libraries. From memory, the autoencoder's bottleneck is the matrix product, which calls BLAS. Different BLAS implementations can differ in speed by up to 10x.
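One rough way to see whether the matrix product is slow on a given machine is to time it directly through NumPy, which also delegates to BLAS. This is only a sketch: the function name `time_matmul`, the 1000x1000 size, and the float32 dtype are my own choices for illustration, and NumPy may be linked to a different BLAS than the one Theano ends up using.

```python
import time
import numpy as np

def time_matmul(n=1000, repeats=3, dtype=np.float32):
    """Return the best wall-clock time in seconds for an n x n matrix product.

    Hypothetical helper for illustration; sizes and dtype are arbitrary choices.
    """
    rng = np.random.default_rng(0)
    a = rng.standard_normal((n, n)).astype(dtype)
    b = rng.standard_normal((n, n)).astype(dtype)
    np.dot(a, b)  # warm-up call so one-time start-up costs are not timed
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    print("best time for a 1000x1000 float32 product: %.4f s" % time_matmul())
```

Running the same script on both machines gives a quick comparison. A large gap here suggests the BLAS (or the number of threads it uses) is the difference, rather than the model code.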
So check whether you used the same BLAS. I would recommend installing Python via the EPD/Canopy or Anaconda Python distributions. Their non-free versions link to a good BLAS, and Theano reuses it. The non-free versions are free for academic use.
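To compare the BLAS on two installations, something like the following can be run on each one. It is a sketch, not a definitive check: `report_blas` is a hypothetical helper name, `numpy.show_config()` reports what NumPy was built against, and `theano.config.blas.ldflags` shows the link flags Theano will use, which may or may not match NumPy's.

```python
import numpy as np

def report_blas():
    """Print NumPy's build configuration and, if Theano imports, its BLAS link flags.

    Hypothetical helper for illustration. Returns the Theano flags string,
    or None when Theano cannot be imported.
    """
    np.show_config()  # lists the BLAS/LAPACK libraries NumPy was built against
    try:
        import theano
    except Exception:
        print("Theano is not importable in this environment")
        return None
    flags = theano.config.blas.ldflags
    print("theano.config.blas.ldflags = %r" % (flags,))
    return flags

if __name__ == "__main__":
    report_blas()
```

If the two machines print different libraries or different flags, that is a likely source of the speed difference.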