Oracle PGX on Yarn - 404 on WebService
I'm running Yarn on Oracle BDA X7-2, specs:
- Cloudera Enterprise 5.14.3
- Java 1.8.0_171
- PGX 2.7.1
I'm trying to run PGX on Yarn following this manual:
https://docs.oracle.com/cd/E56133_01/2.5.0/tutorials/yarn.html
I managed to run the installation script and completed the config file it provides as follows:
{
"pgx_yarn_jar_hdfs_path": "hdfs:/user/pgx/pgx-yarn-2.7.1.jar",
"pgx_war_hdfs_path": "hdfs:/user/pgx/pgx-webapp-2.7.1.war",
"pgx_conf_hdfs_path": "hdfs:/user/pgx/pgx.conf",
"pgx_log4j_conf_hdfs_path": "hdfs:/user/pgx/log4j2.xml",
"pgx_dist_log4j_conf_hdfs_path": "hdfs:/user/pgx/dist_log4j.xml",
"pgx_cluster_host_hdfs_path": "hdfs:/user/pgx/cluster-host.tgz",
"zookeeper_connect_string": "bda1node05,bda1node06,bda1node07",
"standard_library_path": "/usr/lib64/gcc/4.8.2",
"min_heap_size": "512m",
"max_heap_size": "12g",
"container_cores": 9,
"container_memory": 0,
"container_priority": 0,
"num_machines": 1
}
Yarn shows a pgx-service application in the RUNNING state with no errors in stderr, and the log reports the service running at this address:
http://bda1node06:7007
The Linux Java process is running with the following command:
/usr/java/default/bin/java -Xms512m -Xmx12g oracle.pgx.yarn.PgxService bda1node06 /u11/hadoop/yarn/nm/usercache/root/appcache/application_1539869144089_2070/container_e22_1539869144089_2070_01_000002/pgx-server.war 7007 bda1node05,bda1node06,bda1node07 /pgx-8eef44e2-1657-403a-8193-0102f5266680
When I run the PGX client to test the connection:
$PGX_HOME/bin/pgx --base_url http://bda1node06:7007
I get:
java.util.concurrent.ExecutionException: java.lang.IllegalStateException: cannot connect to server; requested http://bda1node06:7007/version?extendedInfo=true and expected status 200, got 404 instead; response body = ""
at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
at oracle.pgx.api.PgxFuture.get(PgxFuture.java:99)
at oracle.pgx.api.ServerInstance.createSession(ServerInstance.java:559)
at oracle.pgx.shell.Console.initSession(Console.java:280)
at oracle.pgx.shell.Console.<init>(Console.java:153)
at oracle.pgx.shell.Console.main(Console.java:296)
Caused by: java.lang.IllegalStateException: cannot connect to server; requested http://bda1node06:7007/version?extendedInfo=true and expected status 200, got 404 instead; response body = ""
at oracle.pgx.api.ClientApiProvider.lambda$versionCheck$2(ClientApiProvider.java:189)
at oracle.pgx.client.RemoteUtils.lambda$asyncRequest$5(RemoteUtils.java:278)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I don't know how to debug this or how to check whether an extra path is needed in the connection URL.
How should I proceed?
Thanks in advance!
bigdata yarn cloudera cloudera-manager oracle-spatial
Is there any useful output when running yarn logs -applicationId <appId>?
– Korbi
Nov 16 '18 at 20:33
Another thing you can try is adjusting the container_cores and container_memory settings in yarn.conf: set them to small values to make sure YARN doesn't request more capacity than is available, which could cause the service to never be deployed. I think setting them to 0 means the maximum available cores/CPU capacity.
– Korbi
Nov 20 '18 at 19:52
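A minimal sketch of that suggestion, assuming yarn.conf is plain JSON as shown in the question. The function name and the values 1 and 1024 are hypothetical placeholders; the thread does not state the unit of container_memory, so check the PGX documentation before choosing real values.

```python
import json


def shrink_container_request(conf_text, cores=1, memory=1024):
    """Return yarn.conf JSON text with small container_cores/container_memory.

    The defaults are illustrative placeholders, not recommended settings.
    """
    conf = json.loads(conf_text)
    conf["container_cores"] = cores
    conf["container_memory"] = memory
    return json.dumps(conf, indent=2)


if __name__ == "__main__":
    original = '{"container_cores": 9, "container_memory": 0, "num_machines": 1}'
    print(shrink_container_request(original))
```

All other keys are left untouched, so the output can replace the original file before restarting the service.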
@Korbi thank you so much for your suggestions. Sorry for the delay in responding; I was busy over the last few weeks and was not following this thread. Today we had a meeting with Adriano (from Oracle Brazil) and Albert (Oracle PO), whom I think you may know. I'll check your suggestions first thing tomorrow morning. Thanks in advance!
– Samamba
Dec 6 '18 at 20:13
Just to confirm: when you start the PGX server manually, using the pgx/bin/start-server script, does the server start successfully? And are you then able to connect from the client when running it on the BDA too?
– Albert Godfrind
Dec 7 '18 at 18:13
@AlbertGodfrind following our meeting, I just did what you recommended: opened the Groovy shell, connected to the HBase database, created some vertices and edges, successfully instantiated a PGX analyst, and ran some basic operations (count triangles, etc.). Everything worked fine. I then edited the conf/server.conf file, disabled TLS and authentication, and started the PGX server. It seems to be running and listening on port 7007, and now I'm struggling a little to connect without SSL using the bin/pgx client. Everything is on the Oracle BDA.
– Samamba
Dec 10 '18 at 19:45
asked Nov 13 '18 at 15:23
Samamba
2 Answers
By default, PGX has a base path of /pgx, which means you should connect as follows:
$PGX_HOME/bin/pgx --base_url http://bda1node06:7007/pgx
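To check which context path actually responds, the version endpoint that the client requests can be probed directly. This is a rough sketch: the list of candidate paths (the root and /pgx) and the function names are assumptions for illustration, and only the Python standard library is used.

```python
import urllib.error
import urllib.request


def candidate_version_urls(base_url, context_paths=("", "/pgx")):
    """Build the version-check URL for each candidate context path."""
    base = base_url.rstrip("/")
    return [f"{base}{path}/version?extendedInfo=true" for path in context_paths]


def probe(url, timeout=5):
    """Return the HTTP status for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (urllib.error.URLError, OSError):
        return None


if __name__ == "__main__":
    for url in candidate_version_urls("http://bda1node06:7007"):
        print(url, probe(url))
```

A 404 on every candidate suggests the web application itself is not deployed at any of those paths, whereas None points to a network or port problem.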
I get a similar response: $PGX_HOME/bin/pgx --base_url bda1node06:7007/pgx java.util.concurrent.ExecutionException: java.lang.IllegalStateException: cannot connect to server; requested bda1node06:7007/pgx/version?extendedInfo=true and expected status 200, got 404 instead; response body = ""
– Samamba
Nov 13 '18 at 17:07
I'll do a little follow up here.
We've managed to start a PGX server and manipulate an HBase graph! :D
PGX "Hello World"
We wrote a small script to insert vertices and edges, instantiate PGX, and run a simple example. This is it:
// Build a config for an HBase-backed property graph named 'sinapse'
cfg = GraphConfigBuilder.forPropertyGraphHbase().setName('sinapse').setZkQuorum('bda1node05').build()
opg = OraclePropertyGraph.getInstance(cfg)
// Add three vertices, each with a 'nome' property
a = opg.addVertex()
a.setProperty('nome', 'Felipe')
b = opg.addVertex()
b.setProperty('nome', 'Rhenan')
c = opg.addVertex()
c.setProperty('nome', 'Hugo')
// Connect the vertices with labeled edges
opg.addEdge(a, b, 'Pai de')
opg.addEdge(b, c, 'Pai de')
opg.addEdge(a, c, 'Avo de')
// Persist the changes to HBase
opg.commit()
// Create a PGX session, load the graph into memory, and count its triangles
session = Pgx.createSession('sinapsepgx')
analyst = session.createAnalyst()
pgxGraph = session.readGraphWithProperties(opg.getConfig(), true)
analyst.countTriangles(pgxGraph, true)
And that worked just fine!
Client - Server architecture
As the next step, we moved to client/server mode, starting the server with the start-server script.
We managed to do that just fine too!
These are our config files:
server.conf
{
"port": 7007,
"enable_tls": false,
"enable_client_authentication": false
}
pgx.conf
{
"allow_idle_timeout_overwrite": true,
"allow_local_filesystem": false,
"allow_task_timeout_overwrite": true,
"enable_gm_compiler": true,
"enterprise_scheduler_config": {
"analysis_task_config": {
"priority": "MEDIUM",
"weight": 12,
"max_threads": 12
},
"fast_analysis_task_config": {
"priority": "HIGH",
"weight": 1,
"max_threads": 12
},
"num_io_threads_per_task": 12
},
"preload_graphs": [
{"path": "graphs/sinapse_conf.json",
"name": "sinapse"}
],
"max_active_sessions": 1024,
"max_queue_size_per_session": -1,
"max_snapshot_count": 0,
"memory_cleanup_interval": 600,
"path_to_gm_compiler": null,
"release_memory_threshold": 0.85,
"session_idle_timeout_secs": 0,
"session_task_timeout_secs": 0,
"strict_mode": true,
"tmp_dir": "/tmp"
}
sinapse_conf.json
{
"edge_props": [
{
"name": "relacao",
"type": "string"
}
],
"db_engine": "HBASE",
"vertex_props": [
{
"name": "nome",
"type": "string"
},
{
"name": "cpf",
"type": "string"
}
],
"format": "pg",
"name": "sinapse",
"error_handling": {},
"vertex_id_type": "long",
"attributes": {},
"loading": {},
"zk_quorum": "bda1node05,bda1node06,bda1node07"
}
The start-server script ran just fine with that and preloaded our HBase graph. It works like a charm.
Connected to the server using the pgx client:
./bin/pgx -b http://localhost:7007
And managed to do the same we did in the groovy shell.
That's awesome.
PGX on Yarn
Well, now we are back to our original challenge: running and managing PGX on Yarn.
We copied our pgx.conf file to HDFS, like this:
hdfs://mpmapas-ns/user/pgx/pgx.conf
{
"allow_idle_timeout_overwrite": true,
"allow_local_filesystem": false,
"allow_task_timeout_overwrite": true,
"enable_gm_compiler": true,
"enterprise_scheduler_config": {
"analysis_task_config": {
"priority": "MEDIUM",
"weight": 12,
"max_threads": 12
},
"fast_analysis_task_config": {
"priority": "HIGH",
"weight": 1,
"max_threads": 12
},
"num_io_threads_per_task": 12
},
"preload_graphs": [
{"path": "graphs/sinapse_conf.json",
"name": "sinapse"}
],
"max_active_sessions": 1024,
"max_queue_size_per_session": -1,
"max_snapshot_count": 0,
"memory_cleanup_interval": 600,
"path_to_gm_compiler": null,
"release_memory_threshold": 0.85,
"session_idle_timeout_secs": 0,
"session_task_timeout_secs": 0,
"strict_mode": true,
"tmp_dir": "/tmp"
}
/opt/oracle/oracle-spatial-graph/property_graph/pgx/yarn/conf/yarn.conf
{
"pgx_yarn_jar_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar",
"pgx_war_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war",
"pgx_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx.conf",
"pgx_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/log4j2.xml",
"pgx_dist_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/dist_log4j.xml",
"pgx_cluster_host_hdfs_path": "hdfs://mpmapas-ns/user/pgx/cluster-host.tgz",
"zookeeper_connect_string": "bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br",
"standard_library_path": "/usr/lib64/gcc/4.8.2",
"min_heap_size": "512m",
"max_heap_size": "12g",
"container_cores": 9,
"container_memory": 0,
"container_priority": 0,
"num_machines": 1
}
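A side note on the HDFS URI forms used in this thread: hdfs:/user/pgx/... carries no authority, while hdfs://mpmapas-ns/user/pgx/... names the nameservice explicitly. A URI written as hdfs://user/pgx/... would instead treat "user" as the authority. The sketch below shows the split using Python's generic URI parser; it does not exercise Hadoop's own path handling, and the function name is only illustrative.

```python
from urllib.parse import urlparse


def describe_hdfs_uri(uri):
    """Split a URI into its authority (host or nameservice) and path."""
    parts = urlparse(uri)
    return {"authority": parts.netloc, "path": parts.path}


if __name__ == "__main__":
    for uri in (
        "hdfs:/user/pgx/pgx.conf",
        "hdfs://mpmapas-ns/user/pgx/pgx.conf",
        "hdfs://user/pgx/pgx.conf",
    ):
        print(uri, describe_hdfs_uri(uri))
```

When a file seems to be missing on the cluster, comparing the authority and path of the configured URI against the actual HDFS location is a cheap first check.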
Also, @albert recommended that we remove the log4j2.xml from the server/shared-mem/pgx-webapp-2.7.1.war file, so that log4j logging is handled only by the file placed in our HDFS folder.
So we unpacked the war file, removed the log4j2.xml, repacked it, and edited the log4j2.xml file on HDFS like this:
hdfs://mpmapas-ns/user/pgx/log4j2.xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
<Appenders>
<Console name="Console" target="SYSTEM_OUT">
<PatternLayout pattern="%d{HH:mm:ss,SSS} %p %C{1} - %m%n"/>
</Console>
<File name="LogFile" fileName="file:/tmp/pg_trace.log">
<PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
</File>
</Appenders>
<Loggers>
<Root level="debug">
<AppenderRef ref="LogFile"/>
</Root>
<Logger name="oracle.pgx.engine.admin.Ctrl" level="debug">
<AppenderRef ref="LogFile"/>
</Logger>
<Logger name="pgx.dist.cluster_host" level="debug">
<AppenderRef ref="LogFile"/>
</Logger>
</Loggers>
</Configuration>
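The unpack, remove, and repack step can also be scripted. This is a sketch under one assumption: the location of log4j2.xml inside the war is not given in this thread, so the function (a hypothetical helper) skips any entry whose file name is log4j2.xml, wherever it sits.

```python
import zipfile


def repack_without(src_war, dst_war, basename="log4j2.xml"):
    """Copy src_war to dst_war, skipping entries named `basename`.

    Returns the list of entry names that were left out.
    """
    skipped = []
    with zipfile.ZipFile(src_war) as src, \
            zipfile.ZipFile(dst_war, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename.rsplit("/", 1)[-1] == basename:
                skipped.append(info.filename)
                continue
            # Reuse the original entry metadata for everything we keep.
            dst.writestr(info, src.read(info.filename))
    return skipped


if __name__ == "__main__":
    removed = repack_without("pgx-webapp-2.7.1.war", "pgx-webapp-2.7.1-nolog.war")
    print("removed:", removed)
```

Writing to a new file keeps the original war intact; the output name in the usage example is a placeholder.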
And finally ran the yarn start server command, just like this:
yarn jar yarn/pgx-yarn-2.7.1.jar yarn/conf/yarn.conf
The end of the log output looks really nice:
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.version=4.1.12-124.14.1.el7uek.x86_64
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.name=root
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/oracle/oracle-spatial-graph/property_graph/pgx
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br sessionTimeout=10000 watcher=oracle.pgx.yarn.ClientZkClient@32da97fd
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Opening socket connection to server bda1node07.pgj.rj.gov.br/192.168.8.7:2181. Will not attempt to authenticate using SASL (unknown error)
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.8.5:33299, server: bda1node07.pgj.rj.gov.br/192.168.8.7:2181
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Session establishment complete on server bda1node07.pgj.rj.gov.br/192.168.8.7:2181, sessionid = 0x3668759ae4553df, negotiated timeout = 10000
18/12/11 16:25:05 INFO yarn.StartService: waiting for PGX service (yarn appId == 'application_1539869144089_2555') to come up ...
18/12/11 16:25:10 INFO yarn.StartService: retrieved PGX host: http://bda1node07.pgj.rj.gov.br:7007
18/12/11 16:25:10 INFO yarn.StartService: to connect a remote shell to this host, run '$PGX_HOME/bin/pgx --base_url http://bda1node07.pgj.rj.gov.br:7007'
18/12/11 16:25:10 INFO yarn.StartService: to shut the PGX service down, run 'yarn application -kill application_1539869144089_2555'
18/12/11 16:25:10 INFO zookeeper.ZooKeeper: Session: 0x3668759ae4553df closed
18/12/11 16:25:10 INFO zookeeper.ClientCnxn: EventThread shut down
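The host URL and the Yarn application id can be pulled out of this startup output automatically, which helps when scripting follow-up checks such as yarn logs. A minimal sketch, assuming the message wording stays exactly as in the log above; the regular expressions and function name are illustrative.

```python
import re

# Patterns follow the StartService messages shown in the log above.
HOST_RE = re.compile(r"retrieved PGX host: (\S+)")
APP_RE = re.compile(r"yarn appId == '(application_\d+_\d+)'")


def parse_start_log(text):
    """Extract the PGX base URL and Yarn application id from startup output."""
    host = HOST_RE.search(text)
    app = APP_RE.search(text)
    return {
        "base_url": host.group(1) if host else None,
        "app_id": app.group(1) if app else None,
    }


if __name__ == "__main__":
    import sys
    print(parse_start_log(sys.stdin.read()))
```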
But connecting to it still returns 404 ;(
The last piece of information I can give you is the Yarn stderr log, which also indicates that we are not using log4j correctly:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u09/hadoop/yarn/nm/filecache/890/pgx-yarn-2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
18/12/11 16:25:06 INFO yarn.AppMaster: register app
18/12/11 16:25:06 INFO yarn.AppMaster: RM response = [queue=root.users.root,maxCap=<memory:65536, vCores:9>]
18/12/11 16:25:06 INFO yarn.AppMaster: max capability of cluster: <memory:65536, vCores:9>
18/12/11 16:25:06 INFO yarn.AppMaster: attempting to allocate 1 containers
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 1: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 2: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 3: got 1 containers. Available: <memory:129024, vCores:171>
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar into pgx-yarn.jar
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war into pgx-server.war
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx.conf into conf/pgx.conf
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/log4j2.xml into conf/log4j2.xml
18/12/11 16:25:07 INFO yarn.AppMaster: server env = {CLASSPATH=conf:pgx-server/WEB-INF/lib/*:pgx-yarn.jar:$HADOOP_CONF_DIR}
18/12/11 16:25:07 INFO yarn.AppMaster: server command = $JAVA_HOME/bin/java -Xms512m -Xmx12g oracle.pgx.yarn.PgxService bda1node07.pgj.rj.gov.br $PWD/pgx-server.war 7007 bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br /pgx-37a121ce-e028-432c-8761-104027126c3b 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr;
18/12/11 16:25:07 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:12 INFO yarn.AppMaster: check for completion
.
.
.
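To see at a glance which SLF4J binding was picked, the stderr text can be scanned for the binding lines. This is only a sketch with illustrative names. As far as I know, org.slf4j.impl.Log4jLoggerFactory is the class shipped in the slf4j-log4j12 (Log4j 1.x) binding, but treat that reading as an assumption to verify rather than a diagnosis.

```python
import re

# Patterns follow the SLF4J messages shown in the stderr log above.
FOUND_RE = re.compile(r"SLF4J: Found binding in \[jar:file:(.+?)!")
ACTUAL_RE = re.compile(r"SLF4J: Actual binding is of type \[(.+?)\]")


def slf4j_bindings(stderr_text):
    """Return the jars offering an SLF4J binding and the binding actually used."""
    actual = ACTUAL_RE.search(stderr_text)
    return {
        "found": FOUND_RE.findall(stderr_text),
        "actual": actual.group(1) if actual else None,
    }


if __name__ == "__main__":
    import sys
    print(slf4j_bindings(sys.stdin.read()))
```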
This is the farthest we've managed to go.
We can start our work now! That's really exciting.
Now I know how to properly start a service, preload, insert, manage data, and we will import our existing graph database to it and do some experimentation.
Would be lovely to have this running on Yarn at the production level.
Thank you all for the extreme dedication and attention.
2 Answers
By default, PGX has a base path of /pgx, which means you should connect as follows:
$PGX_HOME/bin/pgx --base_url http://bda1node06:7007/pgx
answered Nov 13 '18 at 16:39
Martijn
I get a similar response: $PGX_HOME/bin/pgx --base_url bda1node06:7007/pgx java.util.concurrent.ExecutionException: java.lang.IllegalStateException: cannot connect to server; requested bda1node06:7007/pgx/version?extendedInfo=true and expected status 200, got 404 instead; response body = ""
– Samamba
Nov 13 '18 at 17:07
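To compare the two base paths discussed above, here is a minimal Python sketch that builds the version-check URL the client requests (the `/version?extendedInfo=true` suffix is taken from the error message) for the root path and for `/pgx`. The helper name `version_urls` and the list of candidate paths are my own assumptions for illustration; this is not part of PGX.

```python
from urllib.parse import urlsplit, urlunsplit


def version_urls(base_url, context_paths=("", "/pgx")):
    """Build the version-check URL for each candidate context path.

    The candidate paths are assumptions: "" (server root) and "/pgx".
    """
    parts = urlsplit(base_url)
    # Require an explicit scheme, unlike the scheme-less URL used in the comment above.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("base_url must look like http://host:port")
    # Keep only scheme and host:port; any path on base_url is dropped.
    root = urlunsplit((parts.scheme, parts.netloc, "", "", ""))
    return [f"{root}{path}/version?extendedInfo=true" for path in context_paths]


if __name__ == "__main__":
    for url in version_urls("http://bda1node06:7007"):
        print(url)
```

Each printed URL can then be requested with any HTTP client from a machine that can reach the node. If both return 404, the base path is probably not the problem and the web application itself may not have been deployed.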
I'll do a little follow-up here.
We've managed to start a PGX server and manipulate an HBase graph! :D
PGX "Hello World"
We wrote a small script to insert vertices and edges, instantiate PGX, and run a simple example:
cfg = GraphConfigBuilder.forPropertyGraphHbase().setName('sinapse').setZkQuorum('bda1node05').build()
opg = OraclePropertyGraph.getInstance(cfg)
a = opg.addVertex()
a.setProperty('nome', 'Felipe')
b = opg.addVertex()
b.setProperty('nome', 'Rhenan')
c = opg.addVertex()
c.setProperty('nome', 'Hugo')
opg.addEdge(a, b, 'Pai de')
opg.addEdge(b, c, 'Pai de')
opg.addEdge(a, c, 'Avo de')
opg.commit()
session = Pgx.createSession('sinapsepgx')
analyst = session.createAnalyst()
pgxGraph = session.readGraphWithProperties(opg.getConfig(), true)
analyst.countTriangles(pgxGraph, true)
And that worked just fine!
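As a sanity check on what the example above should compute, here is a small plain-Python sketch that counts triangles in the same three-vertex graph. It is only an illustration and does not use the PGX API. It assumes edges are treated as undirected, and the post does not state the value PGX actually returned.

```python
from itertools import combinations


def count_triangles(edges):
    """Count triangles, treating every edge as undirected (an assumption)."""
    neighbors = {}
    for src, dst in edges:
        if src == dst:
            continue  # self-loops cannot be part of a triangle
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
    total = 0
    # Check every unordered triple of vertices for three mutual connections.
    for u, v, w in combinations(sorted(neighbors), 3):
        if v in neighbors[u] and w in neighbors[u] and w in neighbors[v]:
            total += 1
    return total


# Same shape as the graph above: a -> b, b -> c, a -> c.
edges = [("Felipe", "Rhenan"), ("Rhenan", "Hugo"), ("Felipe", "Hugo")]
print(count_triangles(edges))  # prints 1
```

The brute-force triple check is fine for a toy graph like this one; it is not meant to reflect how PGX implements the algorithm.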
Client-server architecture
As the next step, we moved to client/server mode, using the start-server script.
That worked just fine too!
These are our config files:
server.conf
{
"port": 7007,
"enable_tls": false,
"enable_client_authentication": false
}
pgx.conf
{
"allow_idle_timeout_overwrite": true,
"allow_local_filesystem": false,
"allow_task_timeout_overwrite": true,
"enable_gm_compiler": true,
"enterprise_scheduler_config": {
"analysis_task_config": {
"priority": "MEDIUM",
"weight": 12,
"max_threads": 12
},
"fast_analysis_task_config": {
"priority": "HIGH",
"weight": 1,
"max_threads": 12
},
"num_io_threads_per_task": 12
},
"preload_graphs": [
{"path": "graphs/sinapse_conf.json",
"name": "sinapse"}
],
"max_active_sessions": 1024,
"max_queue_size_per_session": -1,
"max_snapshot_count": 0,
"memory_cleanup_interval": 600,
"path_to_gm_compiler": null,
"release_memory_threshold": 0.85,
"session_idle_timeout_secs": 0,
"session_task_timeout_secs": 0,
"strict_mode": true,
"tmp_dir": "/tmp"
}
sinapse_conf.json
{
"edge_props": [
{
"name": "relacao",
"type": "string"
}
],
"db_engine": "HBASE",
"vertex_props": [
{
"name": "nome",
"type": "string"
},
{
"name": "cpf",
"type": "string"
}
],
"format": "pg",
"name": "sinapse",
"error_handling": {},
"vertex_id_type": "long",
"attributes": {},
"loading": {},
"zk_quorum": "bda1node05,bda1node06,bda1node07"
}
The start-server script ran just fine with that and preloaded our HBase graph; it works like a charm.
We connected to the server using the PGX client:
./bin/pgx -b http://localhost:7007
And we managed to do the same things we did in the Groovy shell.
That's awesome.
PGX on Yarn
Well, now we are back to our challenge: running and managing PGX on Yarn.
We've copied our pgx.conf file to HDFS, like this:
hdfs://user/pgx/pgx.conf
{
"allow_idle_timeout_overwrite": true,
"allow_local_filesystem": false,
"allow_task_timeout_overwrite": true,
"enable_gm_compiler": true,
"enterprise_scheduler_config": {
"analysis_task_config": {
"priority": "MEDIUM",
"weight": 12,
"max_threads": 12
},
"fast_analysis_task_config": {
"priority": "HIGH",
"weight": 1,
"max_threads": 12
},
"num_io_threads_per_task": 12
},
"preload_graphs": [
{"path": "graphs/sinapse_conf.json",
"name": "sinapse"}
],
"max_active_sessions": 1024,
"max_queue_size_per_session": -1,
"max_snapshot_count": 0,
"memory_cleanup_interval": 600,
"path_to_gm_compiler": null,
"release_memory_threshold": 0.85,
"session_idle_timeout_secs": 0,
"session_task_timeout_secs": 0,
"strict_mode": true,
"tmp_dir": "/tmp"
}
/opt/oracle/oracle-spatial-graph/property_graph/pgx/yarn/conf/yarn.conf
{
"pgx_yarn_jar_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar",
"pgx_war_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war",
"pgx_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx.conf",
"pgx_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/log4j2.xml",
"pgx_dist_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/dist_log4j.xml",
"pgx_cluster_host_hdfs_path": "hdfs://mpmapas-ns/user/pgx/cluster-host.tgz",
"zookeeper_connect_string": "bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br",
"standard_library_path": "/usr/lib64/gcc/4.8.2",
"min_heap_size": "512m",
"max_heap_size": "12g",
"container_cores": 9,
"container_memory": 0,
"container_priority": 0,
"num_machines": 1
}
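HDFS locations are written three ways in this thread: `hdfs:/user/pgx/...` in the original question's config, `hdfs://mpmapas-ns/user/pgx/...` in the yarn.conf above, and `hdfs://user/pgx/...` in the file labels. The sketch below shows how a generic URI parser splits each form. It uses Python's standard `urllib.parse`, so it is only an approximation of what the Hadoop client does, and the helper name `describe_hdfs_uri` is hypothetical.

```python
from urllib.parse import urlsplit


def describe_hdfs_uri(uri):
    """Split a URI into scheme, authority (host or nameservice) and path."""
    parts = urlsplit(uri)
    return {"scheme": parts.scheme, "authority": parts.netloc, "path": parts.path}


if __name__ == "__main__":
    for uri in (
        "hdfs://mpmapas-ns/user/pgx/pgx.conf",  # explicit nameservice
        "hdfs:/user/pgx/pgx.conf",              # no authority at all
        "hdfs://user/pgx/pgx.conf",             # "user" is read as the authority
    ):
        print(uri, "->", describe_hdfs_uri(uri))
```

Under this parsing, only the first two forms point at the path /user/pgx/pgx.conf. In the third, `user` lands in the authority position, so that form is best read as a shorthand label rather than a usable URI.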
Also, @albert recommended that we remove log4j2.xml from the server/shared-mem/pgx-webapp-2.7.1.war file so that log4j logging is handled only by the file placed in our HDFS folder.
So we unpacked the war file, removed log4j2.xml, repacked it, and edited the log4j2.xml file on HDFS like this:
hdfs://user/pgx/log4j2.xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
<Appenders>
<Console name="Console" target="SYSTEM_OUT">
<PatternLayout pattern="%d{HH:mm:ss,SSS} %p %C{1} - %m%n"/>
</Console>
<File name="LogFile" fileName="file:/tmp/pg_trace.log">
<PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
</File>
</Appenders>
<Loggers>
<Root level="debug">
<AppenderRef ref="LogFile"/>
</Root>
<Logger name="oracle.pgx.engine.admin.Ctrl" level="debug">
<AppenderRef ref="LogFile"/>
</Logger>
<Logger name="pgx.dist.cluster_host" level="debug">
<AppenderRef ref="LogFile"/>
</Logger>
</Loggers>
</Configuration>
And finally we ran the Yarn start-server command, just like this:
yarn jar yarn/pgx-yarn-2.7.1.jar yarn/conf/yarn.conf
And we get this at the bottom of the log file, which looks really nice:
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.version=4.1.12-124.14.1.el7uek.x86_64
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.name=root
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/oracle/oracle-spatial-graph/property_graph/pgx
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br sessionTimeout=10000 watcher=oracle.pgx.yarn.ClientZkClient@32da97fd
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Opening socket connection to server bda1node07.pgj.rj.gov.br/192.168.8.7:2181. Will not attempt to authenticate using SASL (unknown error)
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.8.5:33299, server: bda1node07.pgj.rj.gov.br/192.168.8.7:2181
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Session establishment complete on server bda1node07.pgj.rj.gov.br/192.168.8.7:2181, sessionid = 0x3668759ae4553df, negotiated timeout = 10000
18/12/11 16:25:05 INFO yarn.StartService: waiting for PGX service (yarn appId == 'application_1539869144089_2555') to come up ...
18/12/11 16:25:10 INFO yarn.StartService: retrieved PGX host: http://bda1node07.pgj.rj.gov.br:7007
18/12/11 16:25:10 INFO yarn.StartService: to connect a remote shell to this host, run '$PGX_HOME/bin/pgx --base_url http://bda1node07.pgj.rj.gov.br:7007'
18/12/11 16:25:10 INFO yarn.StartService: to shut the PGX service down, run 'yarn application -kill application_1539869144089_2555'
18/12/11 16:25:10 INFO zookeeper.ZooKeeper: Session: 0x3668759ae4553df closed
18/12/11 16:25:10 INFO zookeeper.ClientCnxn: EventThread shut down
But connecting to it still returns 404 ;(
The last piece of information I can give you is the Yarn stderr log, which also reports that we are not using log4j correctly:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u09/hadoop/yarn/nm/filecache/890/pgx-yarn-2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
18/12/11 16:25:06 INFO yarn.AppMaster: register app
18/12/11 16:25:06 INFO yarn.AppMaster: RM response = [queue=root.users.root,maxCap=<memory:65536, vCores:9>]
18/12/11 16:25:06 INFO yarn.AppMaster: max capability of cluster: <memory:65536, vCores:9>
18/12/11 16:25:06 INFO yarn.AppMaster: attempting to allocate 1 containers
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 1: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 2: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 3: got 1 containers. Available: <memory:129024, vCores:171>
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar into pgx-yarn.jar
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war into pgx-server.war
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx.conf into conf/pgx.conf
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/log4j2.xml into conf/log4j2.xml
18/12/11 16:25:07 INFO yarn.AppMaster: server env = {CLASSPATH=conf:pgx-server/WEB-INF/lib/*:pgx-yarn.jar:$HADOOP_CONF_DIR}
18/12/11 16:25:07 INFO yarn.AppMaster: server command = $JAVA_HOME/bin/java -Xms512m -Xmx12g oracle.pgx.yarn.PgxService bda1node07.pgj.rj.gov.br $PWD/pgx-server.war 7007 bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br /pgx-37a121ce-e028-432c-8761-104027126c3b 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr;
18/12/11 16:25:07 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:12 INFO yarn.AppMaster: check for completion
.
.
.
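Since the stderr output above complains about a missing log4j2 configuration and about two SLF4J bindings, it may help to check what is actually packaged inside the archives. A war or jar is a zip file, so a minimal Python sketch can list matching entries. The helper name `find_entries` is hypothetical, and the archive paths you pass in are up to you.

```python
import sys
import zipfile


def find_entries(archive_path, filename="log4j2.xml"):
    """Return every entry in a war/jar whose base name equals `filename`."""
    with zipfile.ZipFile(archive_path) as archive:
        return [
            name
            for name in archive.namelist()
            if name.rsplit("/", 1)[-1] == filename
        ]


if __name__ == "__main__":
    # Usage: python find_entries.py pgx-webapp-2.7.1.war [more archives...]
    for path in sys.argv[1:]:
        print(path, find_entries(path))
```

An empty list for the repacked war would confirm that log4j2.xml was really removed. Calling `find_entries(path, "StaticLoggerBinder.class")` on pgx-yarn-2.7.1.jar should show the SLF4J binding that the stderr log reports.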
This is the farthest we've managed to go.
We can start our work now! That's really exciting.
Now I know how to properly start a service and preload, insert, and manage data; we will import our existing graph database into it and do some experimentation.
It would be lovely to have this running on Yarn at the production level.
Thank you all for the extreme dedication and attention.
18/12/11 16:25:06 INFO yarn.AppMaster: register app
18/12/11 16:25:06 INFO yarn.AppMaster: RM response = [queue=root.users.root,maxCap=<memory:65536, vCores:9>]
18/12/11 16:25:06 INFO yarn.AppMaster: max capability of cluster: <memory:65536, vCores:9>
18/12/11 16:25:06 INFO yarn.AppMaster: attempting to allocate 1 containers
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 1: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 2: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 3: got 1 containers. Available: <memory:129024, vCores:171>
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar into pgx-yarn.jar
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war into pgx-server.war
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx.conf into conf/pgx.conf
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/log4j2.xml into conf/log4j2.xml
18/12/11 16:25:07 INFO yarn.AppMaster: server env = {CLASSPATH=conf:pgx-server/WEB-INF/lib/*:pgx-yarn.jar:$HADOOP_CONF_DIR}
18/12/11 16:25:07 INFO yarn.AppMaster: server command = $JAVA_HOME/bin/java -Xms512m -Xmx12g oracle.pgx.yarn.PgxService bda1node07.pgj.rj.gov.br $PWD/pgx-server.war 7007 bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br /pgx-37a121ce-e028-432c-8761-104027126c3b 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr;
18/12/11 16:25:07 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:12 INFO yarn.AppMaster: check for completion
.
.
.
This is the farthest we've managed to go.
We can start our work now! That's realy exciting.
Now I know how to properly start a service, preload, insert, manage data, and we will import our existing graph database to it and do some experimentation.
Would be lovely to have this running on Yarn at the production level.
Thank you all for the extreme dedication and attention.
I'll do a little follow up here.
We've managed to start a PGX server and manipulate an HBase graph! :D
PGX "Hello World"
We wrote a short script to insert vertices and edges, instantiate PGX and run a simple example. This is it:
cfg = GraphConfigBuilder.forPropertyGraphHbase().setName('sinapse').setZkQuorum('bda1node05').build()
opg = OraclePropertyGraph.getInstance(cfg)
a = opg.addVertex()
a.setProperty('nome', 'Felipe')
b = opg.addVertex()
b.setProperty('nome', 'Rhenan')
c = opg.addVertex()
c.setProperty('nome', 'Hugo')
opg.addEdge(a, b, 'Pai de')
opg.addEdge(b, c, 'Pai de')
opg.addEdge(a, c, 'Avo de')
opg.commit()
session = Pgx.createSession('sinapsepgx')
analyst = session.createAnalyst()
pgxGraph = session.readGraphWithProperties(opg.getConfig(), true)
analyst.countTriangles(pgxGraph, true)
And that worked just fine!
Client - Server architecture
As the next step, we moved to client/server mode, using the start-server script.
We managed to do that just fine too!
These are our config files:
server.conf
{
"port": 7007,
"enable_tls": false,
"enable_client_authentication": false
}
pgx.conf
{
"allow_idle_timeout_overwrite": true,
"allow_local_filesystem": false,
"allow_task_timeout_overwrite": true,
"enable_gm_compiler": true,
"enterprise_scheduler_config": {
"analysis_task_config": {
"priority": "MEDIUM",
"weight": 12,
"max_threads": 12
},
"fast_analysis_task_config": {
"priority": "HIGH",
"weight": 1,
"max_threads": 12
},
"num_io_threads_per_task": 12
},
"preload_graphs": [
{"path": "graphs/sinapse_conf.json",
"name": "sinapse"}
],
"max_active_sessions": 1024,
"max_queue_size_per_session": -1,
"max_snapshot_count": 0,
"memory_cleanup_interval": 600,
"path_to_gm_compiler": null,
"release_memory_threshold": 0.85,
"session_idle_timeout_secs": 0,
"session_task_timeout_secs": 0,
"strict_mode": true,
"tmp_dir": "/tmp"
}
sinapse_conf.json
{
"edge_props": [
{
"name": "relacao",
"type": "string"
}
],
"db_engine": "HBASE",
"vertex_props": [
{
"name": "nome",
"type": "string"
},
{
"name": "cpf",
"type": "string"
}
],
"format": "pg",
"name": "sinapse",
"error_handling": {},
"vertex_id_type": "long",
"attributes": {},
"loading": {},
"zk_quorum": "bda1node05,bda1node06,bda1node07"
}
The start-server script ran just fine with that and preloaded our HBase graph. It works like a charm.
We connected to the server using the pgx client:
./bin/pgx -b http://localhost:7007
And we managed to do the same things we did in the Groovy shell.
That's awesome.
PGX on Yarn
Well, now we are back to our challenge: running and managing PGX on Yarn.
We've copied our pgx.conf file to HDFS, like this:
hdfs://user/pgx/pgx.conf
{
"allow_idle_timeout_overwrite": true,
"allow_local_filesystem": false,
"allow_task_timeout_overwrite": true,
"enable_gm_compiler": true,
"enterprise_scheduler_config": {
"analysis_task_config": {
"priority": "MEDIUM",
"weight": 12,
"max_threads": 12
},
"fast_analysis_task_config": {
"priority": "HIGH",
"weight": 1,
"max_threads": 12
},
"num_io_threads_per_task": 12
},
"preload_graphs": [
{"path": "graphs/sinapse_conf.json",
"name": "sinapse"}
],
"max_active_sessions": 1024,
"max_queue_size_per_session": -1,
"max_snapshot_count": 0,
"memory_cleanup_interval": 600,
"path_to_gm_compiler": null,
"release_memory_threshold": 0.85,
"session_idle_timeout_secs": 0,
"session_task_timeout_secs": 0,
"strict_mode": true,
"tmp_dir": "/tmp"
}
/opt/oracle/oracle-spatial-graph/property_graph/pgx/yarn/conf/yarn.conf
{
"pgx_yarn_jar_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar",
"pgx_war_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war",
"pgx_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/pgx.conf",
"pgx_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/log4j2.xml",
"pgx_dist_log4j_conf_hdfs_path": "hdfs://mpmapas-ns/user/pgx/dist_log4j.xml",
"pgx_cluster_host_hdfs_path": "hdfs://mpmapas-ns/user/pgx/cluster-host.tgz",
"zookeeper_connect_string": "bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br",
"standard_library_path": "/usr/lib64/gcc/4.8.2",
"min_heap_size": "512m",
"max_heap_size": "12g",
"container_cores": 9,
"container_memory": 0,
"container_priority": 0,
"num_machines": 1
}
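Since yarn.conf points at several files on HDFS, it can help to confirm that each of them really exists before starting the service. The following is only a sketch: the function name list_hdfs_paths is made up for illustration, and it simply pulls every quoted value starting with hdfs: out of the config file.

```shell
# Hypothetical helper (not part of PGX): print every quoted value in a
# config file that starts with "hdfs:", one per line, without the quotes.
list_hdfs_paths() {
  grep -o '"hdfs:[^"]*"' "$1" | tr -d '"'
}

# Usage (not run here; assumes the hdfs CLI is on the PATH):
#   list_hdfs_paths yarn/conf/yarn.conf | while read -r p; do
#     if hdfs dfs -test -e "$p"; then echo "ok      $p"; else echo "MISSING $p"; fi
#   done
```

Any path reported as MISSING would be worth fixing first, because the service can only copy into its container what is actually present on HDFS.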
Also, @albert recommended that we remove the log4j2.xml from the server/shared-mem/pgx-webapp-2.7.1.war file, so that log4j logging is handled only by the file placed in our HDFS folder.
So we unpacked the war file, removed log4j2.xml, repacked it, and edited the log4j2.xml file on HDFS like this:
hdfs://user/pgx/log4j2.xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
<Appenders>
<Console name="Console" target="SYSTEM_OUT">
<PatternLayout pattern="%d{HH:mm:ss,SSS} %p %C{1} - %m%n"/>
</Console>
<File name="LogFile" fileName="file:/tmp/pg_trace.log">
<PatternLayout pattern="%d{HH:mm:ss.SSS} [%t] %-5level %logger{36} - %msg%n"/>
</File>
</Appenders>
<Loggers>
<Root level="debug">
<AppenderRef ref="LogFile"/>
</Root>
<Logger name="oracle.pgx.engine.admin.Ctrl" level="debug">
<AppenderRef ref="LogFile"/>
</Logger>
<Logger name="pgx.dist.cluster_host" level="debug">
<AppenderRef ref="LogFile"/>
</Logger>
</Loggers>
</Configuration>
And finally we ran the Yarn start-server command, just like this:
yarn jar yarn/pgx-yarn-2.7.1.jar yarn/conf/yarn.conf
And the bottom of the log file looks really nice:
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:os.version=4.1.12-124.14.1.el7uek.x86_64
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.name=root
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/oracle/oracle-spatial-graph/property_graph/pgx
18/12/11 16:25:03 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br sessionTimeout=10000 watcher=oracle.pgx.yarn.ClientZkClient@32da97fd
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Opening socket connection to server bda1node07.pgj.rj.gov.br/192.168.8.7:2181. Will not attempt to authenticate using SASL (unknown error)
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.8.5:33299, server: bda1node07.pgj.rj.gov.br/192.168.8.7:2181
18/12/11 16:25:03 INFO zookeeper.ClientCnxn: Session establishment complete on server bda1node07.pgj.rj.gov.br/192.168.8.7:2181, sessionid = 0x3668759ae4553df, negotiated timeout = 10000
18/12/11 16:25:05 INFO yarn.StartService: waiting for PGX service (yarn appId == 'application_1539869144089_2555') to come up ...
18/12/11 16:25:10 INFO yarn.StartService: retrieved PGX host: http://bda1node07.pgj.rj.gov.br:7007
18/12/11 16:25:10 INFO yarn.StartService: to connect a remote shell to this host, run '$PGX_HOME/bin/pgx --base_url http://bda1node07.pgj.rj.gov.br:7007'
18/12/11 16:25:10 INFO yarn.StartService: to shut the PGX service down, run 'yarn application -kill application_1539869144089_2555'
18/12/11 16:25:10 INFO zookeeper.ZooKeeper: Session: 0x3668759ae4553df closed
18/12/11 16:25:10 INFO zookeeper.ClientCnxn: EventThread shut down
But connecting to it still returns 404 ;(
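To see what the server itself answers, independent of the PGX client, the same URL the client requests (/version?extendedInfo=true, taken from the client's error message) can be probed directly. This is a minimal sketch that assumes curl is installed; both function names are made up for illustration.

```shell
# Hypothetical helper: print only the HTTP status code returned by the
# version endpoint of the given base URL. curl prints 000 when it gets
# no HTTP response at all.
pgx_http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1/version?extendedInfo=true"
}

# Hypothetical helper: turn a status code into a hint about what it means.
explain_status() {
  case "$1" in
    200) echo "the version endpoint answers" ;;
    404) echo "something answers HTTP on this port, but nothing is served at /version" ;;
    000) echo "no HTTP response (connection refused, DNS failure or timeout)" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage (not run here):
#   explain_status "$(pgx_http_status http://bda1node07.pgj.rj.gov.br:7007)"
```

A 404 here, as opposed to 000, would confirm that the process is listening on port 7007 and that the problem lies in what is deployed behind it, not in the network.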
The last piece of information I can give you is the Yarn stderr log, which also shows that we are not using log4j correctly:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.14.2-1.cdh5.14.2.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/u09/hadoop/yarn/nm/filecache/890/pgx-yarn-2.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console. Set system property 'log4j2.debug' to show Log4j2 internal initialization logging.
18/12/11 16:25:06 INFO yarn.AppMaster: register app
18/12/11 16:25:06 INFO yarn.AppMaster: RM response = [queue=root.users.root,maxCap=<memory:65536, vCores:9>]
18/12/11 16:25:06 INFO yarn.AppMaster: max capability of cluster: <memory:65536, vCores:9>
18/12/11 16:25:06 INFO yarn.AppMaster: attempting to allocate 1 containers
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 1: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 2: got 0 containers. Available: <memory:194560, vCores:180>
18/12/11 16:25:06 INFO yarn.AppMaster: attempt 3: got 1 containers. Available: <memory:129024, vCores:171>
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-yarn-2.7.1.jar into pgx-yarn.jar
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx-webapp-2.7.1.war into pgx-server.war
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/pgx.conf into conf/pgx.conf
18/12/11 16:25:06 INFO yarn.AppMaster: copy hdfs://mpmapas-ns/user/pgx/log4j2.xml into conf/log4j2.xml
18/12/11 16:25:07 INFO yarn.AppMaster: server env = {CLASSPATH=conf:pgx-server/WEB-INF/lib/*:pgx-yarn.jar:$HADOOP_CONF_DIR}
18/12/11 16:25:07 INFO yarn.AppMaster: server command = $JAVA_HOME/bin/java -Xms512m -Xmx12g oracle.pgx.yarn.PgxService bda1node07.pgj.rj.gov.br $PWD/pgx-server.war 7007 bda1node05.pgj.rj.gov.br,bda1node06.pgj.rj.gov.br,bda1node07.pgj.rj.gov.br /pgx-37a121ce-e028-432c-8761-104027126c3b 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr;
18/12/11 16:25:07 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:08 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:09 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:10 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:11 INFO yarn.AppMaster: check for completion
18/12/11 16:25:12 INFO yarn.AppMaster: check for completion
.
.
.
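The two logging problems visible at the top of that stderr output can be checked for automatically after each restart. This is a rough sketch: the function name check_logging_markers and the short labels it prints are made up for illustration, and the search strings are copied from the log above.

```shell
# Hypothetical helper: scan a saved log file for the two logging problems
# seen above and print one short label per problem found.
check_logging_markers() {
  if grep -q 'Class path contains multiple SLF4J bindings' "$1"; then
    echo "multiple-slf4j-bindings"
  fi
  if grep -q 'No log4j2 configuration file found' "$1"; then
    echo "log4j2-config-not-found"
  fi
}

# Usage (not run here; the application id is the one from the log above):
#   yarn logs -applicationId application_1539869144089_2555 > pgx-app.log
#   check_logging_markers pgx-app.log
```

An empty result would mean that neither message appears in the saved log.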
This is the farthest we've managed to go.
We can start our work now! That's really exciting.
Now I know how to properly start a service and preload, insert and manage data. We will import our existing graph database into it and do some experimentation.
It would be lovely to have this running on Yarn at the production level.
Thank you all for the extreme dedication and attention.
answered Dec 11 '18 at 18:32
Samamba
Is there any useful output when running yarn logs -applicationId <appId>?
– Korbi
Nov 16 '18 at 20:33
Another thing you can try is playing with the container_cores and container_memory settings in yarn.conf: try setting them to a small value to make sure YARN doesn't request more capacity than is available, which could cause the service to never be deployed. I think setting them to 0 means maximum available cores/CPU capacity.
– Korbi
Nov 20 '18 at 19:52
@Korbi thank you so much for your considerations. Sorry for the delay in responding; I have been busy these last weeks and was not following this thread, and I apologize for that. Today we had a meeting with Adriano (from Oracle Brazil) and Albert (Oracle PO), I think you may know them. I'll check your considerations tomorrow first thing in the morning. Thanks in advance!
– Samamba
Dec 6 '18 at 20:13
Just to confirm: when you start the PGX server manually, using the pgx/bin/start-server script, does the server start successfully? And are you then able to connect from the client, when running it on the BDA too?
– Albert Godfrind
Dec 7 '18 at 18:13
@AlbertGodfrind following our meeting, I have just done what you recommended: opened the groovy shell, connected to the HBase database, created some vertices, created some edges, successfully instantiated a pgx analyst, did some basic operations (count triangles, etc.), and everything worked fine. I edited the conf/server.conf file, disabled TLS and authentication, and started the PGX server. It seems to be running and listening on port 7007, and now I'm struggling a little to connect without SSL using the bin/pgx client. Everything is on the Oracle BDA.
– Samamba
Dec 10 '18 at 19:45