Description
I use the command:
./runLDA.sh 1 "" train default "/user/chengmingbo/input/ydir.txt" "/user/chengmingbo/output" -1 100 5 "/user/chengmingbo/LDALibs.jar" 3
I find that if the output directory path ends with a trailing '/', the training process deletes all the files generated by the formatter.
I don't know how to handle the exception I encountered.
Hadoop version: 0.20.2
OS: Linux version 2.6.18-164.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Thu Sep 3 03:28:30 EDT 2009
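As a workaround until this is fixed, the trailing slash can be stripped before the path reaches the job. This is a minimal sketch; `OUTPUT_DIR` is a hypothetical variable name for illustration, not one taken from runLDA.sh:

```shell
#!/bin/sh
# Hypothetical workaround: normalize the output directory so it never
# ends with '/', which otherwise makes the training step delete the
# formatter's output. OUTPUT_DIR is an assumed name for illustration.
OUTPUT_DIR="/user/chengmingbo/output/"
OUTPUT_DIR="${OUTPUT_DIR%/}"   # strip one trailing slash, if present
echo "$OUTPUT_DIR"             # prints /user/chengmingbo/output
```

The `${var%/}` expansion removes at most one trailing `/` and leaves a path without a trailing slash unchanged, so it is safe to apply unconditionally.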
The full output is as follows:
[attempt failed]
Deleted hdfs://h253014:9000/user/chengmingbo/output
- /d2/hadoop_data/hadoop//bin/hadoop jar /d2/hadoop_data/hadoop//hadoop-streaming.jar -Dmapred.job.queue.name=default -Dmapred.map.tasks.speculative.execution=false -Dmapred.job.map.memory.mb=-1 -Dmapred.map.tasks=1 -Dmapred.child.ulimit=0 -Dmapred.task.timeout=1800000 -Dmapred.map.max.attempts=1 -Dmapred.max.tracker.failures=1 -Dmapreduce.job.acl-view-job=shravanm,smola -input /user/chengmingbo/output_0/input -output /user/chengmingbo/output -cacheArchive /user/chengmingbo/LDALibs.jar#LDALibs -mapper 'LDA.sh 1 " " 100 5' -file LDA.sh -file functions.sh -numReduceTasks 0
12/05/17 14:37:14 WARN streaming.StreamJob: -cacheArchive option is deprecated, please use -archives instead.
packageJobJar: [LDA.sh, functions.sh, /d2/hadoop_data/file_data_dir/hadoop_tmp_dir/hadoop-unjar455411245815288771/] [] /tmp/streamjob6164278369022733494.jar tmpDir=null
12/05/17 14:37:15 WARN snappy.LoadSnappy: Snappy native library is available
12/05/17 14:37:15 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/05/17 14:37:15 INFO snappy.LoadSnappy: Snappy native library loaded
12/05/17 14:37:15 INFO mapred.FileInputFormat: Total input paths to process : 3
12/05/17 14:37:15 INFO streaming.StreamJob: getLocalDirs(): [/d2/hadoop_data/file_data_dir/mapred_local_dir/]
12/05/17 14:37:15 INFO streaming.StreamJob: Running job: job_201205161512_0162
12/05/17 14:37:15 INFO streaming.StreamJob: To kill this job, run:
12/05/17 14:37:15 INFO streaming.StreamJob: /d2/hadoop_data/hadoop//bin/hadoop job -Dmapred.job.tracker=hdfs://h253014:9001/ -kill job_201205161512_0162
12/05/17 14:37:15 INFO streaming.StreamJob: Tracking URL: http://10.255.253.14:50030/jobdetails.jsp?jobid=job_201205161512_0162
12/05/17 14:37:16 INFO streaming.StreamJob: map 0% reduce 0%
12/05/17 14:37:42 INFO streaming.StreamJob: map 100% reduce 100%
12/05/17 14:37:42 INFO streaming.StreamJob: To kill this job, run:
12/05/17 14:37:42 INFO streaming.StreamJob: /d2/hadoop_data/hadoop//bin/hadoop job -Dmapred.job.tracker=hdfs://h253014:9001/ -kill job_201205161512_0162
12/05/17 14:37:42 INFO streaming.StreamJob: Tracking URL: http://10.255.253.14:50030/jobdetails.jsp?jobid=job_201205161512_0162
12/05/17 14:37:42 ERROR streaming.StreamJob: Job not successful. Error: NA
12/05/17 14:37:42 INFO streaming.StreamJob: killJob...
Streaming Command Failed! - exit_code=1
- set +x
[hadoop information]
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 134
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
[LAST 4 K ]
stderr logs
oop/jobcache/job_201205161512_0159/attempt_201205161512_0159_m_000001_0/work/out/learnTopics.*
W0517 14:36:01.712434 20074 Controller.cpp:100] ----------------------------------------------------------------------
W0517 14:36:01.712734 20074 Controller.cpp:115] You have chosen multi machine training mode
W0517 14:36:01.713021 20074 Unigram_Model_Training_Builder.cpp:60] Initializing Dictionary from lda.dict.dump
W0517 14:36:01.713249 20074 Unigram_Model_Training_Builder.cpp:62] Dictionary Initialized
W0517 14:36:01.713443 20074 Unigram_Model_Trainer.cpp:49] Initializing Word-Topic counts table from docs lda.wor, lda.top using 0 words & 100 topics.
W0517 14:36:01.713533 20074 Unigram_Model_Trainer.cpp:53] Initialized Word-Topic counts table
W0517 14:36:01.713568 20074 Unigram_Model_Trainer.cpp:57] Initializing Alpha vector from Alpha_bar = 50
W0517 14:36:01.713608 20074 Unigram_Model_Trainer.cpp:60] Alpha vector initialized
W0517 14:36:01.713624 20074 Unigram_Model_Trainer.cpp:63] Initializing Beta Parameter from specified Beta = 0.01
W0517 14:36:01.713641 20074 Unigram_Model_Trainer.cpp:67] Beta param initialized
Outgoing.cpp:424: Ice::ObjectNotExistException:
object does not exist:
identity: `DM_Server_0'
facet:
operation: ice_isA
terminate called after throwing an instance of 'IceUtil::Exception'
what(): Outgoing.cpp:424: IceUtil::Exception
*** Aborted at 1337236561 (unix time) try "date -d @1337236561" if you are using GNU date ***
PC: @ 0x3293430265 (unknown)
*** SIGABRT (@0x1f400004e6a) received by PID 20074 (TID 0x2b77acda63b0) from PID 20074; stack trace: ***
@ 0x329400e7c0 (unknown)
@ 0x3293430265 (unknown)
@ 0x3293431d10 (unknown)
@ 0x3299cbec44 (unknown)
@ 0x3299cbcdb6 (unknown)
@ 0x3299cbcde3 (unknown)
@ 0x3299cbceca (unknown)
@ 0x44470e DM_Client::add_server()
@ 0x44509e DM_Client::DM_Client()
@ 0x4a3f95 Unigram_Model_Synchronizer_Helper::Unigram_Model_Synchronizer_Helper()
@ 0x4a39b5 Unigram_Model_Synchronized_Training_Builder::create_execution_strategy()
@ 0x447d04 Model_Director::build_model()
@ 0x4396ab main
@ 0x329341d994 (unknown)
@ 0x40d8c9 (unknown)
/d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/attempt_201205161512_0159_m_000001_0/work/./LDA.sh: line 103: 20074 Aborted $LDALIBS/learntopics --model=$model --iter=$iters --topics=$topics --servers="$servers" --chkptdir="${MY_INP_DIR}" $flags 1>&2
Synch directory: hdfs://h253014:9000/user/chengmingbo/output/temporary/synchronize/learntopics
Num of map tasks: 3
Found 3 items
-rw-rw-r-- 3 hadoop supergroup 9 2012-05-17 14:36 /user/chengmingbo/output/temporary/synchronize/learntopics/0
-rw-rw-r-- 3 hadoop supergroup 9 2012-05-17 14:36 /user/chengmingbo/output/temporary/synchronize/learntopics/1
-rw-rw-r-- 3 hadoop supergroup 9 2012-05-17 14:36 /user/chengmingbo/output/temporary/synchronize/learntopics/2
Num of clients done: 3
All clients done!
learntopics returned an error code of 134
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 134
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
syslog logs
2012-05-17 14:35:49,106 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2012-05-17 14:35:49,272 INFO org.apache.hadoop.mapred.TaskRunner: Creating symlink: /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/distcache/-5116373793315176501_-731644028_1402464899/h253014/user/chengmingbo/LDALibs.jar <- /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/attempt_201205161512_0159_m_000001_0/work/LDALibs
2012-05-17 14:35:49,289 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/jars/functions.sh <- /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/attempt_201205161512_0159_m_000001_0/work/functions.sh
2012-05-17 14:35:49,299 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/jars/job.jar <- /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/attempt_201205161512_0159_m_000001_0/work/job.jar
2012-05-17 14:35:49,308 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/jars/LDA.sh <- /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/attempt_201205161512_0159_m_000001_0/work/LDA.sh
2012-05-17 14:35:49,316 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/jars/.job.jar.crc <- /d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/attempt_201205161512_0159_m_000001_0/work/.job.jar.crc
2012-05-17 14:35:49,397 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2012-05-17 14:35:49,578 WARN org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library is available
2012-05-17 14:35:49,578 INFO org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library loaded
2012-05-17 14:35:49,586 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
2012-05-17 14:35:49,694 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/d2/hadoop_data/file_data_dir/mapred_local_dir/taskTracker/hadoop/jobcache/job_201205161512_0159/attempt_201205161512_0159_m_000001_0/work/./LDA.sh, 1, , 100, 5]
2012-05-17 14:35:49,775 INFO org.apache.hadoop.streaming.PipeMapRed: Records R/W=0/1
2012-05-17 14:36:06,632 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed failed!
2012-05-17 14:36:06,713 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-05-17 14:36:06,717 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 134
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
2012-05-17 14:36:06,721 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
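For reference, a streaming subprocess exit code above 128 means the child process died from a signal: code minus 128 gives the signal number, so 134 corresponds to signal 6 (SIGABRT), which matches the "terminate called after throwing an instance of 'IceUtil::Exception'" abort in the stderr logs above. A minimal sketch of decoding it:

```shell
#!/bin/sh
# Decode an exit code > 128 into the signal that killed the process.
code=134
sig=$((code - 128))    # 134 - 128 = 6
kill -l "$sig"         # typically prints ABRT (SIGABRT) on Linux
```

So the real failure is the Ice::ObjectNotExistException for `DM_Server_0` causing learntopics to abort; the PipeMapRed stack trace is just Hadoop streaming reporting that abort.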