After using the first method to avoid wrong-order linking and rebuilding the solution, I'm getting this:
1>BackPropagation.obj : warning LNK4075: ignoring '/EDITANDCONTINUE' due to '/INCREMENTAL:NO' specification
1>LINK : warning LNK4098: defaultlib 'mfc90ud.lib' conflicts with use of other libs; use /NODEFAULTLIB:library
1>LINK : warning LNK4098: defaultlib 'mfcs90ud.lib' conflicts with use of other libs; use /NODEFAULTLIB:library
1>CudaMultipleBackPropagation.obj : error LNK2019: unresolved external symbol "void __cdecl FireLayer__entry(float *,float *,float *,int,int,float *)" (?FireLayer__entry@@YAXPAM00HH0@Z) referenced in function "public: void __thiscall CudaMultipleBackPropagation::DeviceLayer::Fire(int)" (?Fire@DeviceLayer@CudaMultipleBackPropagation@@QAEXH@Z)
1>CudaMultipleBackPropagation.obj : error LNK2019: unresolved external symbol "void __cdecl KernelFireLayer(int,struct dim3 &,int,float *,float *,float *,int,int,float *,int)" (?KernelFireLayer@@YAXHAAUdim3@@HPAM11HH1H@Z) referenced in function "public: void __thiscall CudaMultipleBackPropagation::DeviceLayer::Fire(int)" (?Fire@DeviceLayer@CudaMultipleBackPropagation@@QAEXH@Z)
1>CudaMultipleBackPropagation.obj : error LNK2019: unresolved external symbol "void __cdecl FireOutputLayer__entry(float *,float *,float *,int,int,float *,float *,float *,float *,float *)" (?FireOutputLayer__entry@@YAXPAM00HH00000@Z) referenced in function "public: void __thiscall CudaMultipleBackPropagation::DeviceLayer::Fire(int)" (?Fire@DeviceLayer@CudaMultipleBackPropagation@@QAEXH@Z)
1>C
Obviously it's a different problem now, with LNK2019, because CudaMultipleBackPropagation.obj is listed in the linker inputs.
|
You seem to be getting nowhere fast on this and it all revolves around CudaMultipleBackPropagation.obj as far as I can see. I don't know where this module comes from but if you have the source you may want to try rebuilding it. It may be that this object contains a #pragma comment that includes the library that is causing the conflict.
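For reference, the kind of pragma I mean looks like this - the library name here is just an example:
// if something like this is buried in a header pulled into CudaMultipleBackPropagation.obj,
// the linker will try to pull that library in whether or not it is listed in the project settings
#pragma comment(lib, "SomeLibrary.lib")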
txtspeak is the realm of 9 year old children, not developers. Christian Graus
|
Rebuilding doesn't help and there is no pragma, but I have the source:
KERNEL FireLayer(CUDA_FLOATING_TYPE * inputs, CUDA_FLOATING_TYPE * weights, CUDA_FLOATING_TYPE * m, int mOffset, int totalNeuronsWithSelectiveActivation, CUDA_FLOATING_TYPE * outputs);
KERNEL FireOutputLayer(CUDA_FLOATING_TYPE * inputs, CUDA_FLOATING_TYPE * weights, CUDA_FLOATING_TYPE * m, int mOffset, int totalNeuronsWithSelectiveActivation, CUDA_FLOATING_TYPE * desiredOutputs, CUDA_FLOATING_TYPE * outputs, CUDA_FLOATING_TYPE * localGradient, CUDA_FLOATING_TYPE * rms, CUDA_FLOATING_TYPE * localGradientSpaceNet);
void KernelFireLayer(cudaStream_t stream, dim3 & gridSize, int blockSize, CUDA_FLOATING_TYPE * inputs, CUDA_FLOATING_TYPE * weights, CUDA_FLOATING_TYPE * m, int mOffset, int totalNeuronsWithSelectiveActivation, CUDA_FLOATING_TYPE * outputs, int numInputs);
void KernelFireOutputLayer(cudaStream_t stream, dim3 & gridSize, int blockSize, CUDA_FLOATING_TYPE * inputs, CUDA_FLOATING_TYPE * weights, CUDA_FLOATING_TYPE * m, int mOffset, int totalNeuronsWithSelectiveActivation, CUDA_FLOATING_TYPE * desiredOutputs, CUDA_FLOATING_TYPE * outputs, CUDA_FLOATING_TYPE * localGradient, CUDA_FLOATING_TYPE * rms, CUDA_FLOATING_TYPE * localGradientSpaceNet, int numInputs);
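For reference, KERNEL and CUDA_FLOATING_TYPE come from CudaDefinitions.h; as far as I can tell they boil down to roughly this (my paraphrase, not the exact file), so the two declarations above are __global__ CUDA kernels that only nvcc can compile:
// rough equivalent of what CudaDefinitions.h provides
#define KERNEL __global__ void
#define CUDA_FLOATING_TYPE float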
and the errors:
1>CudaMultipleBackPropagation.obj : error LNK2019: unresolved external symbol "void __cdecl FireLayer__entry(float *,float *,float *,int,int,float *)" (?FireLayer__entry@@YAXPAM00HH0@Z) referenced in function "public: void __thiscall CudaMultipleBackPropagation::DeviceLayer::Fire(int)" (?Fire@DeviceLayer@CudaMultipleBackPropagation@@QAEXH@Z)
1>CudaMultipleBackPropagation.obj : error LNK2019: unresolved external symbol "void __cdecl KernelFireLayer(int,struct dim3 &,int,float *,float *,float *,int,int,float *,int)" (?KernelFireLayer@@YAXHAAUdim3@@HPAM11HH1H@Z) referenced in function "public: void __thiscall CudaMultipleBackPropagation::DeviceLayer::Fire(int)" (?Fire@DeviceLayer@CudaMultipleBackPropagation@@QAEXH@Z)
So it complains about this call: CudaMultipleBackPropagation::DeviceLayer::Fire(int)
It is declared like this:
void Fire(cudaStream_t stream);
Then it is mapped to FireLayer. So does it mean that the arguments of CudaMultipleBackPropagation::DeviceLayer::Fire, i.e. cudaStream_t stream, don't map
to the arguments of FireLayer? Why does compilation pass then? What does the __entry after FireLayer mean?
The whole class:
#include "../cuda.h"
#include "../MultipleBackPropagation.h"
#include "../../Common/CUDA/CudaDefinitions.h"
#include "../../Common/CUDA/Arrays/DeviceArray.h"
#include "../../Common/CUDA/Arrays/HostArray.h"
class CudaMultipleBackPropagation {
private:
class DeviceLayer {
friend class CudaMultipleBackPropagation;
private:
static int neuronsWithSelectiveActivation;
int patterns;
int neurons;
int inputs;
int inputsWithoutBias;
int connections;
DeviceArray<CUDA_FLOATING_TYPE> weights;
DeviceArray<CUDA_FLOATING_TYPE> bestWeights;
DeviceArray<CUDA_FLOATING_TYPE> learnRate;
DeviceArray<CUDA_FLOATING_TYPE> lastDelta;
DeviceArray<CUDA_FLOATING_TYPE> lastDeltaWithoutLearningMomentum;
DeviceArray<CUDA_FLOATING_TYPE> outputs;
DeviceArray<CUDA_FLOATING_TYPE> localGradient;
CUDA_FLOATING_TYPE * inputValues;
CUDA_FLOATING_TYPE * desOutputs;
CUDA_FLOATING_TYPE * m;
int mOffset;
CUDA_FLOATING_TYPE * lgSpaceNet;
CUDA_FLOATING_TYPE * rms;
dim3 dimNeuronsPatterns;
dim3 dimInputsNeurons;
dim3 dimOutputsNeurons;
int inputsBlockSize;
int sharedMemFire;
int sharedMemGradients;
bool isOutputLayer;
public:
DeviceLayer(HostArray<CUDA_FLOATING_TYPE> & hweights, HostArray<CUDA_FLOATING_TYPE> & hlearnRate, HostArray<CUDA_FLOATING_TYPE> & hlastDelta, HostArray<CUDA_FLOATING_TYPE> & hlastDeltaWithoutLearningMomentum, DeviceArray<CUDA_FLOATING_TYPE> * layerInputs, int inputs, int neurons, int nextLayerNeurons, int patterns, CUDA_FLOATING_TYPE * m, int mOffset, CUDA_FLOATING_TYPE * lgSpaceNet) : weights(hweights), learnRate(hlearnRate), lastDelta(hlastDelta), lastDeltaWithoutLearningMomentum(hlastDeltaWithoutLearningMomentum), outputs(neurons * patterns), localGradient(neurons * patterns), dimNeuronsPatterns(neurons, patterns), dimInputsNeurons(inputs, neurons), bestWeights(hweights.Lenght()), dimOutputsNeurons(nextLayerNeurons, neurons) {
connections = hweights.Lenght();
this->m = m;
this->mOffset = mOffset;
this->lgSpaceNet = lgSpaceNet;
this->inputs = inputs;
this->neurons = neurons;
this->patterns = patterns;
inputsWithoutBias = inputs - 1;
inputsBlockSize = 1;
while(inputsBlockSize < MAX_THREADS_PER_BLOCK && inputsBlockSize < inputs) inputsBlockSize <<= 1;
sharedMemFire = weights.Lenght() * sizeof(CUDA_FLOATING_TYPE);
sharedMemGradients = (nextLayerNeurons * (neurons + 1)) * sizeof(CUDA_FLOATING_TYPE);
inputValues = layerInputs->Pointer();
desOutputs = rms = NULL;
isOutputLayer = false;
}
void DefineOutputLayer(CudaMultipleBackPropagation * cmbp) {
isOutputLayer = true;
desOutputs = cmbp->d_desOutputs->Pointer();
rms = cmbp->d_rms->Pointer();
sharedMemFire += neurons * sizeof(CUDA_FLOATING_TYPE);
}
void Fire(cudaStream_t stream);
void CalculateLocalGradient(cudaStream_t stream, CUDA_FLOATING_TYPE * rms, CUDA_FLOATING_TYPE * bestRMS, CUDA_FLOATING_TYPE rmsGrowToApplyRobustLearning, DeviceLayer * nextLayer);
void CorrectWeights(cudaStream_t stream, int patternsBlockSize, CUDA_FLOATING_TYPE * rms, CUDA_FLOATING_TYPE * bestRMS, CUDA_FLOATING_TYPE rmsGrowToApplyRobustLearning, CUDA_FLOATING_TYPE robustFactor, CUDA_FLOATING_TYPE momentum);
};
List<DeviceLayer> layersSpaceNetwork;
List<DeviceLayer> layers;
Pointer< DeviceArray<CUDA_FLOATING_TYPE> > d_inputs;
Pointer< DeviceArray<CUDA_FLOATING_TYPE> > d_desOutputs;
Pointer< DeviceArray<CUDA_FLOATING_TYPE> > d_rms;
Pointer< DeviceArray<CUDA_FLOATING_TYPE> > d_bestRMS;
DeviceArray<CUDA_FLOATING_TYPE> d_rmsOut;
CUDA_FLOATING_TYPE * rms;
Pointer< DeviceArray<int> > d_numberWeightsLayer;
Pointer< DeviceArray<CUDA_FLOATING_TYPE *> > d_weightsLayers;
Pointer< DeviceArray<CUDA_FLOATING_TYPE *> > d_bestWeightsLayers;
Pointer< DeviceArray<CUDA_FLOATING_TYPE *> > d_learnRatesLayers;
Pointer< DeviceArray<CUDA_FLOATING_TYPE *> > d_lastDeltaLayers;
Pointer< DeviceArray<CUDA_FLOATING_TYPE *> > d_lastDeltaWithoutLMlayers;
cudaStream_t streamKernels;
cudaStream_t streamRMS;
int layersRobustTraining;
int maxNumberWeigths;
int patternsBlockSize;
CUDA_FLOATING_TYPE numberPatternsNeurons;
void CreateDeviceLayers(List<Layer> & hostLayers, List<DeviceLayer> & deviceLayers, int patterns, int * neuronsWithSelectiveActivation);
void CopyLayersToHost(List<DeviceLayer> & deviceLayers, List<Layer> & hostLayers);
public:
CudaMultipleBackPropagation(Pointer <MultipleBackPropagation> & mbp, Matrix<double> & trainInputPatterns, Matrix<double> & trainDesiredOutputPatterns);
~CudaMultipleBackPropagation();
void Train(double momentum, double spaceMomentum, bool robustLearning, double rmsGrowToApplyRobustLearning, double robustFactor);
CUDA_FLOATING_TYPE GetRMS() {
return *rms;
}
void CopyNetworkHost(Pointer <MultipleBackPropagation> & mbp);
};
#endif
#include "CudaMultipleBackPropagation.h"
#include "MBPkernels.h"
int CudaMultipleBackPropagation::DeviceLayer::neuronsWithSelectiveActivation = 0;
void CudaMultipleBackPropagation::DeviceLayer::Fire(cudaStream_t stream) {
if (isOutputLayer) {
if(connections > MAX_THREADS_PER_BLOCK) {
KernelFireOutputLayer(stream, dimNeuronsPatterns, inputsBlockSize, inputValues, weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, desOutputs, outputs.Pointer(), localGradient.Pointer(), rms, lgSpaceNet, inputsWithoutBias);
} else {
FireOutputLayer<<<patterns, dimInputsNeurons, sharedMemFire, stream>>>(inputValues, weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, desOutputs, outputs.Pointer(), localGradient.Pointer(), rms, lgSpaceNet);
}
} else {
if(connections > MAX_THREADS_PER_BLOCK) {
KernelFireLayer(stream, dimNeuronsPatterns, inputsBlockSize, inputValues, weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, outputs.Pointer(), inputsWithoutBias);
} else {
FireLayer<<<patterns, dimInputsNeurons, sharedMemFire, stream>>>(inputValues, weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, outputs.Pointer());
}
}
}
void CudaMultipleBackPropagation::DeviceLayer::CalculateLocalGradient(cudaStream_t stream, CUDA_FLOATING_TYPE * rms, CUDA_FLOATING_TYPE * bestRMS, CUDA_FLOATING_TYPE rmsGrowToApplyRobustLearning, DeviceLayer * nextLayer) {
::CalculateLocalGradient<<<patterns, dimOutputsNeurons, sharedMemGradients, stream>>>(rms, bestRMS, rmsGrowToApplyRobustLearning, outputs.Pointer(), nextLayer->weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, nextLayer->localGradient.Pointer(), localGradient.Pointer(), lgSpaceNet);
}
void CudaMultipleBackPropagation::DeviceLayer::CorrectWeights(cudaStream_t stream, int patternsBlockSize, CUDA_FLOATING_TYPE * rms, CUDA_FLOATING_TYPE * bestRMS, CUDA_FLOATING_TYPE rmsGrowToApplyRobustLearning, CUDA_FLOATING_TYPE robustFactor, CUDA_FLOATING_TYPE momentum) {
KernelCorrectLayerWeights(stream, dimInputsNeurons, patternsBlockSize, rms, bestRMS, rmsGrowToApplyRobustLearning, inputValues, localGradient.Pointer(), weights.Pointer(), learnRate.Pointer(), lastDeltaWithoutLearningMomentum.Pointer(), lastDelta.Pointer(), (CUDA_FLOATING_TYPE) Connection::u, (CUDA_FLOATING_TYPE) Connection::d, robustFactor, momentum, patterns);
}
void CudaMultipleBackPropagation::CreateDeviceLayers(List<Layer> & hostLayers, List<DeviceLayer> & deviceLayers, int patterns, int * neuronsWithSelectiveActivation) {
Layer * l = hostLayers.First();
int inputsWithoutBias = l->neurons.Lenght();
DeviceArray<CUDA_FLOATING_TYPE> * layerInputs = d_inputs;
DeviceLayer * outputLayerSpaceNetwork = layersSpaceNetwork.Last();
CUDA_FLOATING_TYPE * m = (neuronsWithSelectiveActivation == NULL) ? NULL : outputLayerSpaceNetwork->outputs.Pointer();
CUDA_FLOATING_TYPE * lgSpaceNet = (neuronsWithSelectiveActivation == NULL) ? NULL : outputLayerSpaceNetwork->localGradient.Pointer();
int mOffset = 0;
Layer * nextLayer = hostLayers.Next();
for (int ln = 1; (l = nextLayer) != NULL; ln++) {
int neurons = l->neurons.Lenght();
int inputs = inputsWithoutBias + 1;
int connections = inputs * neurons;
if (connections > maxNumberWeigths) maxNumberWeigths = connections;
HostArray<CUDA_FLOATING_TYPE> weights(connections);
HostArray<CUDA_FLOATING_TYPE> learningRate(connections);
HostArray<CUDA_FLOATING_TYPE> lDelta(connections);
HostArray<CUDA_FLOATING_TYPE> lastDeltaWithoutLearningMomentum(connections);
int w = 0;
for(NeuronWithInputConnections * n = static_cast<NeuronWithInputConnections *> (l->neurons.First()); n != NULL; n = static_cast<NeuronWithInputConnections *> (l->neurons.Next())) {
for(Connection * c = n->inputs.First(); c != NULL; c = n->inputs.Next()) {
weights[w] = (CUDA_FLOATING_TYPE) c->weight;
learningRate[w] = (CUDA_FLOATING_TYPE) c->learningRate;
lDelta[w] = (CUDA_FLOATING_TYPE) c->delta;
lastDeltaWithoutLearningMomentum[w] = (CUDA_FLOATING_TYPE) c->lastDeltaWithoutLearningMomentum;
w++;
}
}
int numberNeuronsWithSelectiveActivation = (m == NULL) ? 0 : neuronsWithSelectiveActivation[ln];
CUDA_FLOATING_TYPE * ml = (numberNeuronsWithSelectiveActivation) ? m : NULL;
CUDA_FLOATING_TYPE * lgSpaceNetl = (numberNeuronsWithSelectiveActivation) ? lgSpaceNet : NULL;
nextLayer = hostLayers.Next();
int nextLayerNeurons = (nextLayer == NULL) ? 0 : nextLayer->neurons.Lenght();
DeviceLayer * dl = new DeviceLayer(weights, learningRate, lDelta, lastDeltaWithoutLearningMomentum, layerInputs, inputs, neurons, nextLayerNeurons, patterns, ml, mOffset, lgSpaceNetl);
deviceLayers.Add(dl);
mOffset += numberNeuronsWithSelectiveActivation;
layerInputs = &(dl->outputs);
inputsWithoutBias = neurons;
}
}
CudaMultipleBackPropagation::CudaMultipleBackPropagation(Pointer <MultipleBackPropagation> & mbp, Matrix<double> & trainInputPatterns, Matrix<double> & trainDesiredOutputPatterns) : d_rmsOut(1) {
int patterns = trainInputPatterns.Rows();
int ninputs = mbp->Inputs();
int noutputs = mbp->Outputs();
HostArray<CUDA_FLOATING_TYPE> inputs(ninputs * patterns);
HostArray<CUDA_FLOATING_TYPE> desiredOutputs(noutputs * patterns);
for(int p = 0; p < patterns; p++) {
for (int i = 0; i < ninputs; i++) inputs[p * ninputs + i] = (CUDA_FLOATING_TYPE) trainInputPatterns[p][i];
for (int o = 0; o < noutputs; o++) desiredOutputs[p * noutputs + o] = (CUDA_FLOATING_TYPE) trainDesiredOutputPatterns[p][o];
}
d_inputs = new DeviceArray<CUDA_FLOATING_TYPE>(inputs);
d_desOutputs = new DeviceArray<CUDA_FLOATING_TYPE>(desiredOutputs);
maxNumberWeigths = 0;
int * neuronsWithSelectiveActivation = NULL;
if (!mbp->spaceNetwork.IsNull()) {
CreateDeviceLayers(mbp->spaceNetwork->layers, layersSpaceNetwork, patterns, NULL);
neuronsWithSelectiveActivation = mbp->neuronsWithSelectiveActivation.Pointer();
DeviceLayer::neuronsWithSelectiveActivation = layersSpaceNetwork.Last()->neurons;
}
CreateDeviceLayers(mbp->layers, layers, patterns, neuronsWithSelectiveActivation);
DeviceLayer * dlOut = layers.Last();
layersRobustTraining = layersSpaceNetwork.Lenght() + layers.Lenght();
HostArray<int> numberWeightsLayer(layersRobustTraining);
HostArray<CUDA_FLOATING_TYPE *> weightsLayers(layersRobustTraining);
HostArray<CUDA_FLOATING_TYPE *> bestWeightsLayers(layersRobustTraining);
HostArray<CUDA_FLOATING_TYPE *> learnRatesLayers(layersRobustTraining);
HostArray<CUDA_FLOATING_TYPE *> lastDeltaLayers(layersRobustTraining);
HostArray<CUDA_FLOATING_TYPE *> lastDeltaWithoutLMlayers(layersRobustTraining);
int ll = 0;
for(DeviceLayer * l = layersSpaceNetwork.First(); l != NULL; l = layersSpaceNetwork.Next()) {
numberWeightsLayer[ll] = l->connections;
weightsLayers[ll] = l->weights.Pointer();
bestWeightsLayers[ll] = l->bestWeights.Pointer();
learnRatesLayers[ll] = l->learnRate.Pointer();
lastDeltaLayers[ll] = l->lastDelta.Pointer();
lastDeltaWithoutLMlayers[ll] = l->lastDeltaWithoutLearningMomentum.Pointer();
ll++;
}
for(DeviceLayer * l = layers.First(); l != NULL; l = layers.Next()) {
numberWeightsLayer[ll] = l->connections;
weightsLayers[ll] = l->weights.Pointer();
bestWeightsLayers[ll] = l->bestWeights.Pointer();
learnRatesLayers[ll] = l->learnRate.Pointer();
lastDeltaLayers[ll] = l->lastDelta.Pointer();
lastDeltaWithoutLMlayers[ll] = l->lastDeltaWithoutLearningMomentum.Pointer();
ll++;
}
d_numberWeightsLayer = new DeviceArray<int>(numberWeightsLayer);
d_weightsLayers = new DeviceArray<CUDA_FLOATING_TYPE *>(weightsLayers);
d_bestWeightsLayers = new DeviceArray<CUDA_FLOATING_TYPE *>(bestWeightsLayers);
d_learnRatesLayers = new DeviceArray<CUDA_FLOATING_TYPE *>(learnRatesLayers);
d_lastDeltaLayers = new DeviceArray<CUDA_FLOATING_TYPE *>(lastDeltaLayers);
d_lastDeltaWithoutLMlayers = new DeviceArray<CUDA_FLOATING_TYPE *>(lastDeltaWithoutLMlayers);
int sizeRMSvector = (dlOut->connections > MAX_THREADS_PER_BLOCK) ? patterns * dlOut->neurons : patterns;
d_rms = new DeviceArray<CUDA_FLOATING_TYPE>(sizeRMSvector);
dlOut->DefineOutputLayer(this);
HostArray<CUDA_FLOATING_TYPE> h_bestRMS(1);
h_bestRMS[0] = (patterns * CUDA_VALUE(3.0));
d_bestRMS = new DeviceArray<CUDA_FLOATING_TYPE>(h_bestRMS);
cudaMallocHost((void**) &rms, sizeof(CUDA_FLOATING_TYPE));
*rms = CUDA_VALUE(1.0);
patternsBlockSize = 1;
while(patternsBlockSize < MAX_THREADS_PER_BLOCK && patternsBlockSize < patterns) patternsBlockSize <<= 1;
numberPatternsNeurons = (CUDA_FLOATING_TYPE) patterns * (CUDA_FLOATING_TYPE) dlOut->neurons;
cudaStreamCreate(&streamKernels);
cudaStreamCreate(&streamRMS);
}
CudaMultipleBackPropagation::~CudaMultipleBackPropagation() {
cudaStreamDestroy(streamKernels);
cudaStreamDestroy(streamRMS);
*rms = CUDA_VALUE(1.0);
cudaFreeHost(rms);
}
void CudaMultipleBackPropagation::Train(double momentum, double spaceMomentum, bool robustLearning, double rmsGrowToApplyRobustLearning, double robustFactor) {
for(DeviceLayer * l = layersSpaceNetwork.First(); l != NULL; l = layersSpaceNetwork.Next()) l->Fire(streamKernels);
for(DeviceLayer * l = layers.First(); l != NULL; l = layers.Next()) l->Fire(streamKernels);
if (robustLearning) {
KernelCalculateRMS(streamKernels, patternsBlockSize, d_rms->Pointer(), d_rmsOut.Pointer(), d_rms->Lenght(), numberPatternsNeurons);
if (cudaStreamQuery(streamRMS) == cudaSuccess) cudaMemcpyAsync(rms, d_rmsOut.Pointer(), sizeof(CUDA_FLOATING_TYPE), cudaMemcpyDeviceToHost, streamRMS);
RobustLearning<<<1, maxNumberWeigths, 0, streamKernels>>>(d_rmsOut.Pointer(), d_bestRMS->Pointer(), (CUDA_FLOATING_TYPE) rmsGrowToApplyRobustLearning, layersRobustTraining, d_numberWeightsLayer->Pointer(), d_weightsLayers->Pointer(), d_bestWeightsLayers->Pointer(), d_learnRatesLayers->Pointer(), robustFactor, d_lastDeltaWithoutLMlayers->Pointer(), d_lastDeltaLayers->Pointer());
} else {
if (cudaStreamQuery(streamRMS) == cudaSuccess) {
KernelCalculateRMS(streamRMS, patternsBlockSize, d_rms->Pointer(), d_rmsOut.Pointer(), d_rms->Lenght(), numberPatternsNeurons);
cudaMemcpyAsync(rms, d_rmsOut.Pointer(), sizeof(CUDA_FLOATING_TYPE), cudaMemcpyDeviceToHost, streamRMS);
}
}
CUDA_FLOATING_TYPE * rms = (robustLearning) ? d_rmsOut.Pointer() : NULL;
CUDA_FLOATING_TYPE * bestRMS = (robustLearning) ? d_bestRMS->Pointer() : NULL;
DeviceLayer * nextLayer = layers.Last();
for(DeviceLayer * l = layers.Previous(); l != NULL; l = layers.Previous()) {
l->CalculateLocalGradient(streamKernels, rms, bestRMS, (CUDA_FLOATING_TYPE) rmsGrowToApplyRobustLearning, nextLayer);
nextLayer = l;
}
nextLayer = layersSpaceNetwork.Last();
for(DeviceLayer * l = layersSpaceNetwork.Previous(); l != NULL; l = layersSpaceNetwork.Previous()) {
l->CalculateLocalGradient(streamKernels, rms, bestRMS, (CUDA_FLOATING_TYPE) rmsGrowToApplyRobustLearning, nextLayer);
nextLayer = l;
}
for(DeviceLayer * l = layers.Last(); l != NULL; l = layers.Previous()) l->CorrectWeights(streamKernels, patternsBlockSize, rms, bestRMS, rmsGrowToApplyRobustLearning, robustFactor, momentum);
for(DeviceLayer * l = layersSpaceNetwork.Last(); l != NULL; l = layersSpaceNetwork.Previous()) l->CorrectWeights(streamKernels, patternsBlockSize, rms, bestRMS, rmsGrowToApplyRobustLearning, robustFactor, spaceMomentum);
}
void CudaMultipleBackPropagation::CopyLayersToHost(List<DeviceLayer> & deviceLayers, List<Layer> & hostLayers) {
hostLayers.First();
for(DeviceLayer * l = deviceLayers.First(); l != NULL; l = deviceLayers.Next()) {
Layer * hl = hostLayers.Next();
HostArray<CUDA_FLOATING_TYPE> dweights(l->weights);
HostArray<CUDA_FLOATING_TYPE> dlearnRate(l->learnRate);
HostArray<CUDA_FLOATING_TYPE> dlastDelta(l->lastDelta);
HostArray<CUDA_FLOATING_TYPE> dlastDeltaWithoutLearningMomentum(l->lastDeltaWithoutLearningMomentum);
int w = 0;
for(NeuronWithInputConnections * n = static_cast<NeuronWithInputConnections *> (hl->neurons.First()); n != NULL; n = static_cast<NeuronWithInputConnections *> (hl->neurons.Next())) {
for(Connection * c = n->inputs.First(); c != NULL; c = n->inputs.Next()) {
c->weight = dweights[w];
c->learningRate = dlearnRate[w];
c->delta = dlastDelta[w];
c->lastDeltaWithoutLearningMomentum = dlastDeltaWithoutLearningMomentum[w];
w++;
}
}
}
}
void CudaMultipleBackPropagation::CopyNetworkHost(Pointer <MultipleBackPropagation> & mbp) {
if (!mbp->spaceNetwork.IsNull()) CopyLayersToHost(layersSpaceNetwork, mbp->spaceNetwork->layers);
CopyLayersToHost(layers, mbp->layers);
}
|
Where is the implementation of KernelFireLayer() . Once again you are calling some function in your code that is not being included in your link process. I have no idea what part of this is code that you have written and what part comes from some external library, but that seems to be the issue you need to resolve.
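If you do have its source, a wrapper like that is normally a thin host-side function in a .cu file that launches the kernel, along these lines (a sketch only, not necessarily the author's actual implementation), and that .cu file must be compiled by nvcc and its .obj handed to the linker:
// MBPkernels.cu (assumed name) - compiled by nvcc, not by cl
void KernelFireLayer(cudaStream_t stream, dim3 & gridSize, int blockSize,
                     CUDA_FLOATING_TYPE * inputs, CUDA_FLOATING_TYPE * weights,
                     CUDA_FLOATING_TYPE * m, int mOffset,
                     int totalNeuronsWithSelectiveActivation,
                     CUDA_FLOATING_TYPE * outputs, int numInputs) {
    // launch the __global__ FireLayer kernel on the caller's stream
    // (shared memory size and the use of numInputs are omitted in this sketch)
    FireLayer<<<gridSize, blockSize, 0, stream>>>(inputs, weights, m, mOffset,
                                                  totalNeuronsWithSelectiveActivation, outputs);
}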
txtspeak is the realm of 9 year old children, not developers. Christian Graus
|
There is no external library; I just posted the header of the function.
Obviously the bug is here, I think, since LNK2019 occurs for all those functions. But I don't know what this __entry means.
So it looks like, because :Fire(cudaStream_t stream) has just 2 parameters while the functions inside take many more and are not seen
from inside the Fire function, LNK2019 occurs. Do you agree with me?
I cannot find anywhere how this cudaStream_t stream is defined.
It's not my code, BTW.
void CudaMultipleBackPropagation::DeviceLayer::Fire(cudaStream_t stream) {
if (isOutputLayer) {
if(connections > MAX_THREADS_PER_BLOCK) {
KernelFireOutputLayer(stream, dimNeuronsPatterns, inputsBlockSize, inputValues, weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, desOutputs, outputs.Pointer(), localGradient.Pointer(), rms, lgSpaceNet, inputsWithoutBias);
} else {
FireOutputLayer<<<patterns, dimInputsNeurons, sharedMemFire, stream>>>(inputValues, weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, desOutputs, outputs.Pointer(), localGradient.Pointer(), rms, lgSpaceNet);
}
} else {
if(connections > MAX_THREADS_PER_BLOCK) {
KernelFireLayer(stream, dimNeuronsPatterns, inputsBlockSize, inputValues, weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, outputs.Pointer(), inputsWithoutBias);
} else {
FireLayer<<<patterns, dimInputsNeurons, sharedMemFire, stream>>>(inputValues, weights.Pointer(), m, mOffset, neuronsWithSelectiveActivation, outputs.Pointer());
}
}
}
KERNEL FireLayer(CUDA_FLOATING_TYPE * inputs, CUDA_FLOATING_TYPE * weights, CUDA_FLOATING_TYPE * m, int mOffset, int totalNeuronsWithSelectiveActivation, CUDA_FLOATING_TYPE * outputs) {
    extern __shared__ CUDA_FLOATING_TYPE iw[];
    int connection = NEURON * NUM_INPUTS_INCLUDING_BIAS + INPUT;
    SumInputWeight(connection, inputs, weights);
    if (INPUT == 0) {
        int n = PATTERN * NUM_NEURONS + NEURON;
        CUDA_FLOATING_TYPE output = CUDA_SIGMOID(iw[THREAD_ID]);
        if (m != NULL) output *= m[PATTERN * totalNeuronsWithSelectiveActivation + NEURON + mOffset];
        outputs[n] = output;
    }
}
-- modified on Sunday, March 28, 2010 7:27 PM
|
Krzysiaczek99 wrote: :Fire(cudaStream_t stream) has just 2 parameters
That's one parameter.
Krzysiaczek99 wrote: But i dont know what this __entry means.
Just a way the compiler has of generating entry point names.
It certainly looks like there are some mismatches between function calls and definitions. Since this is not your code your first port of call should have been the person whose code it is rather than posting it here. I suggest you try that route now.
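In the meantime you can check for yourself which decorated names each object actually contains and compare them with the names the linker says are unresolved, e.g. (assuming the kernels live in something like MBPkernels.cu producing MBPkernels.obj):
dumpbin /SYMBOLS MBPkernels.obj | findstr /i FireLayer
dumpbin /SYMBOLS CudaMultipleBackPropagation.obj | findstr /i FireLayer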
txtspeak is the realm of 9 year old children, not developers. Christian Graus
|
Yes, I will speak to him. I think he forgot to include the (cudaStream_t stream) file with this class definition; I already found this file in another release of his code.
|
Are you trying to use Nvidia's CUDA or are you trying to link in a lib file that you compile yourself? If the former, is this your first project using CUDA? I think there are several examples you can download.
|
No, it's not my first CUDA project, but it's not my code. Anyway, the reason for those errors is a missing library with one class.
|
Hi all
I have a VB project that I wish to port to VC. The VB project makes use of a DLL file, for which I do not have a .lib or .h file. Is there any way I can do what I am trying to do?
Sorry if this is a dumb question!
Cheers
Mike
|
Assuming you have the VB source code, you will just need to convert the dll function declarations to C++ declarations.
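For example, a VB declaration like Declare Function MyFunction Lib "MyDLL.dll" (ByVal x As Long) As Long (made-up names) maps to something like this on the C++ side; since you have no import library you would then load it at run time, as discussed below:
// hypothetical translation of the VB Declare statement above
typedef long (__stdcall *MyFunctionPtr)(long);   // ByVal x As Long -> long, As Long return -> long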
|
Thanks for taking the time to help!
Previously when I have done a similar thing to this, I have had an associated header and lib file. I would normally go to Projects, settings, and add "MyLib.lib" to object\library modules. Is there another way of adding the dll to the project?
Again thanks for any help
Mike
|
Well, since you don't have a .lib file for the dll, I don't see that you need to add it to the project, per se.
In order to call the functions in the dll you'll have to use LoadLibrary , and GetProcAddress .
Are you familiar with those two functions?
|
no - just looking them up!
|
Well, I'm part of the way there!
I have tried this:
HINSTANCE hDLL = NULL;
FARPROC DLLFunc;
hDLL = LoadLibrary("MyDLL.dll");
if(hDLL)
{
    DLLFunc = GetProcAddress(hDLL,"MyFunction");
    FreeLibrary(hDLL);
}
I get a handle to the DLL, but GetProcAddress fails. I have tried this with a different DLL, and that appeared to work, so maybe there is something unusual with MyDLL.dll - is there a way of finding out what functions are in the DLL?
The VB program declares a variable as well - "Dim WithEvents obj As CVeronixCtl" - how would this be handled?
Thanks again for your time and patience!
Mike
|
The spelling of the function name in GetProcAddress must be exact.
You can also try calling GetLastError like this:
DWORD Error = 0;
DLLFunc = GetProcAddress(hDLL, "MyFunction");
if (!DLLFunc)
    Error = GetLastError();
This will tell you the reason that the call failed.
You can find out the exported functions inside the DLL, by using the program Dumpbin that comes with the SDK.
dumpbin /EXPORTS MyDll.dll
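Once you have the exact exported name (it may well be decorated, e.g. something like _MyFunction@4 for a __stdcall export, and GetProcAddress must be given exactly what dumpbin shows), the usual pattern is roughly this - the function name and signature below are placeholders:
typedef int (__stdcall *MyFunctionPtr)(int);                    // placeholder signature

HINSTANCE hDLL = LoadLibrary("MyDLL.dll");
if (hDLL)
{
    MyFunctionPtr MyFunc = (MyFunctionPtr)GetProcAddress(hDLL, "MyFunction");
    if (MyFunc)
    {
        int result = MyFunc(42);                                // call it before freeing the library
    }
    FreeLibrary(hDLL);                                          // free only once you are done calling it
}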
|
I am trying to figure out why ReadFile returns different values when I step through the program and when I run it, both in debug mode. I am using ReadFile to get data from a COM port.
When stepping through, it returns the expected ERROR_IO_PENDING, but when running, it returns "The system cannot find the file specified".
I am checking for ERROR_IO_PENDING and call GetLastError when it fails.
GetLastError returns the above missing-file message.
I can do WaitForSingleObject with a timeout. My program is not finished, so I cannot actually do any real data reading from the COM port yet.
My guess is that I have some error in my setup and the ERROR_IO_PENDING message is masking it.
Any constructive comments as always are appreciated.
Thank you for your time.
Vaclav
Update
I found one error in my setup - the receiving buffer size was not defined.
Now I am getting "The system cannot find the file specified" all the time - consistently. I think I can fix it now.
Solution / lesson learned:
When using an API which has a GetLastError option, use it before you do any processing of the API's return value. My ill-conceived program logic - checking the return value first - got me into trouble.
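For completeness, the overlapped-read pattern I am aiming for looks roughly like this (a simplified sketch, using the same variable names as in my code below, not my full code):
// simplified sketch of an overlapped ReadFile
OVERLAPPED osRead = { 0 };
osRead.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);           // manual-reset event for this read

BOOL ok = ReadFile(m_hFile, pData, dwBytes, pdwRead, &osRead);
if (!ok)
{
    DWORD lastError = ::GetLastError();                         // grab it immediately
    if (lastError == ERROR_IO_PENDING)
    {
        // the read is still in progress - wait for it, then ask how many bytes actually arrived
        if (WaitForSingleObject(osRead.hEvent, 1000) == WAIT_OBJECT_0)
            GetOverlappedResult(m_hFile, &osRead, pdwRead, FALSE);
    }
    else
    {
        // a genuine failure, e.g. ERROR_FILE_NOT_FOUND (2) = "The system cannot find the file specified"
    }
}
CloseHandle(osRead.hEvent);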
-- Modified Sunday, March 28, 2010 10:47 AM
|
Probably posting the relevant code would help.
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
-- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
-- Iain Clarke
[My articles]
|
OK here is the code - just the base:
iResult = ReadFile(m_hFile,pData,dwBytes,pdwRead,&osRead);
if (!iResult) {
    long lLastError = ::GetLastError();
    if (lLastError == ERROR_IO_PENDING)
    {
        ....
        TRACE("ReadFile(m_hFile,&pData,1,pdwRead,NULL)) ERROR_IO_PENDING \n ");
    }
    else
    {
        ....
        TRACE("ReadFile(m_hFile,&pData,1,pdwRead,NULL)) Unknown errof - read failed \n ");
        pSupport = new COpenHR_Support;
        pSupport->GetLastError_("Failed \nReadFile(m_hFile,&pData,1,pdwRead,&osRead)",lLastError);
    }
}
else
{
    .....
}
|
Guys,
I am working with SQLite using C++. While I am trying to create tables I am getting a CppSQLite exception. I have already added CppSQLite3.h as a header file. What I wrote is:
CppSQLite3.h
const *char gszfile="d:\\temp.db";
CppSQLit3DB db;
db. execDML("create table emp(cc varchar(20), subject varchar(30)");
I need a solution for this; please give me more explanation.
|
You did not open the database.
|
How can I open the database?
|
For example, your code looks like this:
const *char gszfile="d:\\temp.db";
CppSQLit3DB db;
db. execDML("create table emp(cc varchar(20), subject varchar(30)");
You are missing this before you call execDML:
db.open(gszfile);
When you execute the command without opening the database, you will certainly get an exception.
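Putting it together, a minimal version would look roughly like this (note that your snippet also has the pointer declaration reversed, a typo in the class name, and a missing closing parenthesis in the SQL, and that CppSQLite reports its errors through CppSQLite3Exception):
#include <cstdio>
#include "CppSQLite3.h"

void CreateEmpTable()
{
    const char *gszFile = "d:\\temp.db";
    try
    {
        CppSQLite3DB db;
        db.open(gszFile);    // open (or create) the database before any execDML call
        db.execDML("create table emp(cc varchar(20), subject varchar(30));");
    }
    catch (CppSQLite3Exception &e)
    {
        // errorMessage() tells you exactly what SQLite complained about
        printf("%d: %s\n", e.errorCode(), e.errorMessage());
    }
}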
|
Thanks for your reply. Even though I am trying your solution, the same exception still exists. The exception is like:
C++ exception at memory location 0x0000012ee8e
|
What line is throwing the exception?
"One man's wage rise is another man's price increase." - Harold Wilson
"Fireproof doesn't mean the fire will never come. It means when the fire comes that you will be able to withstand it." - Michael Simmons
"Man who follows car will be exhausted." - Confucius
|