Since you are just starting, this presentation may give you some insight:
http://home.iitk.ac.in/~rhegde/ee627_2010/lec_3.4.pdf
Its title says "basic", but you can't avoid fiddling with the mathematics. Still, try to browse the books listed at the end of the presentation.
The key is understanding the properties of the human voice. Once you know them, your job becomes much easier: finding those properties in a digital signal (the raw sample bytes).
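To make "finding those properties" a bit more concrete, here is a minimal sketch in Python of three measurements that are commonly taken on a short frame of audio: short-time energy, zero-crossing rate, and a pitch estimate from the autocorrelation peak. These three are my own choice of examples and are not taken from the presentation. The function names, the 60 to 400 Hz search range, and the synthetic 200 Hz test tone are all assumptions for illustration; real speech needs windowing, framing, and noise handling that this leaves out.

```python
import math

def short_time_energy(frame):
    """Mean squared amplitude of one frame (loud/voiced frames score high)."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency in Hz.

    Picks the lag with the largest (unnormalised) autocorrelation inside
    the lag range that corresponds to fmin..fmax. The range is an assumed
    rough span for speech, not a fixed rule.
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag

# Synthetic stand-in for a voiced sound: 50 ms of a 200 Hz sine at 8 kHz.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]

print("energy:", short_time_energy(frame))
print("zero-crossing rate:", zero_crossing_rate(frame))
print("pitch (Hz):", estimate_pitch(frame, SAMPLE_RATE))  # about 200 for this tone
```

As a rough rule of thumb, voiced speech (vowels) tends to show high energy, a low zero-crossing rate, and a clear pitch, while unvoiced sounds like "s" show the opposite. With a real recording you would first decode the bytes into samples (for example with Python's `wave` module) and then run functions like these frame by frame.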